A COMPARISON OF SHORT-TERM FORECASTING 
MODELS 



Ralph Eugene Hayes 










Thesis Advisor: P. W. Zehna 

September 1971 



Approved for public release; distribution unlimited.



A Comparison of Short-Term 
Forecasting Models 



by 



Ralph Eugene Hayes
Major, United States Army 
B.S., North Georgia College, 1960 



Submitted in partial fulfillment of the 
requirements for the degree of 



MASTER OF SCIENCE IN OPERATIONS RESEARCH 



from the 

NAVAL POSTGRADUATE SCHOOL 
September 1971 





ABSTRACT 



Seven short-term forecasting models, two using least 
squares estimation methods and five employing variations of 
the exponentially weighted moving average method, are com- 
pared in their relative ability to produce minimum error 
variance forecasts for seven simulated time series. Each 
series was generated to enable one of the forecast models 
to be the least squared error predictor. A comparison 
methodology is developed which facilitates forecast model 
selection based on single or group series forecast per- 
formance through the measurement of model specification 
errors. A computer program is presented which may be 



[text illegible in source]



forecast models to be ranked in order of their relative 
specification error. 



TABLE OF CONTENTS

I. INTRODUCTION

II. SHORT-TERM FORECASTING MODELS

    A. LEAST-SQUARES ESTIMATION MODELS
        1. Simple Least-Squares Forecasts
            a. Series Generation Model
            b. Parameter Estimation
            c. The Forecast Model
        2. Modified Least-Squares Forecasts
            a. Consequences of Autocorrelated Disturbances
            b. Series Generation Model
            c. The Modified Forecast Model

    B. THE SIMPLE EXPONENTIALLY WEIGHTED MOVING AVERAGE (EWMA)
       FORECAST MODEL
        1. Integrated Moving Average Processes
        2. The Series Generation Model
        3. Optimal Properties of Exponentially Weighted Forecasts
        4. The Forecast Model as an IMA Process Generator
        5. Parameter Estimation Procedures

    C. COX'S MODIFIED EWMA FORECAST MODEL
        1. The Series Generation Model for the Exponential
           Autocorrelation Function
        2. Wiener's Optimal Linear Predictor
        3. The Modified Form of the EWMA Model
        4. Parameter Estimation

    D. THE HOLT-WINTERS AND THE THEIL-WAGE TWO-PARAMETER EWMA
       LINEAR GROWTH FORECAST MODELS
        1. The Linear Growth Series Generation Model
        2. The Forecast Models
            a. The Holt-Winters Model
            b. The Theil-Wage Model
        3. Parameter Estimation Procedures
            a. The Holt-Winters Model
            b. The Theil-Wage Model

    E. BROWN'S ONE-PARAMETER EWMA LINEAR GROWTH MODEL
        1. The Series Generation Model
        2. The Forecast Model
        3. Parameter Estimation

    F. THE BOX-JENKINS POLYNOMIAL PREDICTOR - THE
       [remainder of heading illegible in source]
        1. The General Polynomial Growth Model
        2. Optimal Properties of the Polynomial Predictor
        3. Parameter Estimation
        4. The Relationship of Special Cases of the General
           Polynomial Predictor
            a. The Simple Exponentially Weighted Moving Average
            b. The Holt-Winters Linear Growth Model
            c. The Brown Linear Growth Model
            d. The Theil-Wage Linear Growth Model

III. RELATIVE FORECAST MODEL PERFORMANCE COMPARISON

    A. COMPARISON METHODOLOGY
        1. Time Series Generating Model Specification
        2. Forecast Model Specification Error Measurement
        3. Forecast Model Performance Criterion
        4. Time Series Data Generation Methods
            a. Parameter Selection
            b. The Random Number Generator
            c. Normal Random Number Generation
            d. Uniform Random Number Range Transformation
        5. Single and Group Series Forecast Performance Comparison
            a. Single Series Comparison
            b. Forecast Comparisons Using Several Series
        6. Forecast Model Stabilization and Operation

    B. COMPUTER COMPARISON ANALYSIS
        1. [heading illegible in source]
        2. Forecast Model Performance Comparisons
            a. The Least-Squares Series
            b. The Modified Least-Squares Series
            c. The Simple EWMA Series
            d. The Modified EWMA Series
            e. The Holt-Winters Linear Growth Series
            f. The Theil-Wage Series
            g. The Brown Linear Growth Series

IV. CONCLUSION AND RECOMMENDATIONS

    A. CONCLUSIONS
        1. Validity of Assumptions
        2. Conclusions and Applications

    B. RECOMMENDATIONS
        1. Extensions of This Investigation
        2. Application of Methodology

COMPUTER PROGRAM

LIST OF REFERENCES

INITIAL DISTRIBUTION LIST

FORM DD 1473




TABLE OF SYMBOLS AND ABBREVIATIONS 



Following are the meanings of the notation most frequently
used throughout this thesis.

$\hat{Y}_t$ = the estimate or forecast of the observed value ($Y_t$)
of a random process $\{Y_t\}$ at time $t$.

$\hat{\bar{Y}}_t$ = the estimate or forecast of the level or mean value
($\bar{Y}_t$) at time $t$ of the process $\{Y_t\}$ made at time $t$.

$\hat{b}_t$ = the estimate at time $t-1$ of the process slope ($b_t$)
at time $t$ in a linear model.

$\varepsilon_t$ = a random shock or superimposed error experienced by
the random process at time $t$ which contributes to the
error in forecasting the observed value.

$F_t$ = the one step ahead forecast at time $t-1$ of the as
yet unobserved series value, $Y_t$.

$e_t$ = the one step ahead forecast error ($e_t = Y_t - F_t$) which
is realized when the observation $Y_t$ is taken.

$\alpha$ = the smoothing constant in an EWMA model. Also, $\alpha$ is
the portion of a random shock which is considered to
be a permanent contribution to process level, $\alpha = 1 - \beta$.

$\beta$ = the discounting factor in an EWMA model; the rate at
which the weight given to past observations diminishes.
$\beta = 1 - \alpha$.

$\alpha'$ = the smoothing constant associated with the slope
estimator of a forecast model.

$\beta'$ = the discounting factor associated with the forecast
model slope estimator. $\beta'$ is not necessarily equal to
$1 - \alpha'$.

$\hat{\beta}_2$ = the estimate of $\beta_2$, the slope of the linear model in
least squares methodology.

$\rho$ = the autocorrelation coefficient of the series $\{Y_t\}$
with lag of one period.

[symbol illegible in source] = the measure of model specification error.




I. INTRODUCTION 



When a random event has a significant impact on 
society, considerable effort will be made to predict its 
occurrence. The ability to predict or forecast such an 
event is a by-product of a quantitative understanding of 
the situation, a physical model. Prediction based upon 
a behavioral model is the ideal, but forecasts can also 
be based upon recognition of regularity as well as upon 
explanation of that regularity. 

There are many situations in both industrial and
governmental operations which require forecasts for
hundreds or thousands of routinely recurring events. Often
these events, such as equipment failures which generate
demands for repair parts, are not influenced by
advertising or other factors. In the absence of additional
information a projection into the future must be made
based entirely upon past observations of these events.

It is frequently true that none of the events, individ- 
ually, are of sufficient importance to warrant the study 
and attention required to develop behavioral models. 

For these items of low cost and nonsensitive nature a 
routine forecasting system is desired which employs the 
"management by exception" principle. In evaluating the 
forecast accuracy of such systems, a cost-effectiveness 
approach must be taken. A forecast model which performs 
well in one application may be totally unsuited in another 









where the forecast accuracy required is more stringent. 
Since any forecasting system selected for use must be sat- 
isfactorily adaptable to a wide range of demand patterns 
it is of interest to examine a few of the short-term fore- 
casting models which are currently in vogue. 

This thesis investigates the types of time series for 
which various forecast models are appropriate and the de- 
gree to which their forecast performance is degraded by 
changes in the forecast series generation model. The re- 
sults of this investigation should suggest which model or 
models tend to be most adaptive in the sense that satis- 
factory forecasts are consistently made for the particular 
series forecast in the study. It is realized that fore-
[remainder of sentence illegible in source]
somewhat subjective. Also, the results obtained will
necessarily be highly dependent on the series upon which
forecasts are based, and the series used in this study may 
not be representative of the many demand or failure pat- 
terns which must be forecast in practice. One must also 
remember that the optimality (minimum variance unbiased- 
ness) of even the least versatile forecast model may be 
demonstrated. As Bossons [Ref. 13] has concluded: 

(a) if it can be shown that the model corresponds 
to a linear transformation of the stochastic process as- 
sumed to generate the time series to be forecast by the 
model, and 






(b) if efficient estimates can be derived for the 
parameters of the stochastic model and thus also of the rule 
for "adapting" the forecasts, then the optimality of any 
rule can be demonstrated. 

The decision to trade this possibly restricted optimality 
for versatility can only be made in the environment where 
the model is to be ultimately used. It should also be 
noted that the measure of optimality associated with the 
EWMA models differs from that used in the least-squares
models and for this reason a brief discussion of the
difference is needed. The EWMA models use the method of
discounted least squares (D.L.S.). An excellent discussion
by Gilchrist of this and other discounting methods is
contained in Reference [number illegible in source]. His
modification of the method of moments, least-squares and
maximum likelihood to incorporate discounting may offer a
fertile area for future investigation in which the modified
methods are compared to the traditional procedures. He
argues that the discounted estimators have more robust
properties than the standard methods.

The method of discounted least squares is normally used
with models of the general form

$$Y_t = f(X_t; \theta) + \varepsilon_t, \qquad t = 0, 1, \ldots$$

where the $t$th observation is expressed as a function of all
previous observations, a parameter $\theta$, and time $t$. The
estimates of $\theta$ are chosen to minimize the discounted sum of
squared errors

$$\sum_{i=0} w_i\, e_{t-i}^2$$

where $w_i$ is usually of the form $\beta^i$, $0 < \beta < 1$. In using this
form of $w_i$, a weighted average results which gives maximum
weight to the most recent data. The estimators are easily
updated by simple recursive relations as each new observa- 
tion is taken. The comparisons made in this investigation 
were based only on the mean squared error of forecast, how- 
ever, and no discounting was involved. It should be made 
clear that forecast performance has been measured objec- 
tively without regard to the type of optimality which each 
model seeks to obtain. There is no attempt in this thesis 
to extend conclusions beyond such inherent characteristics. 
However, it is hoped that subsequently, when operational 
time series data must be forecast using a model, the com- 
parison methodology expounded here will aid in the selection 
of the "best" model for that data. 
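The discounting idea described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not taken from the thesis program. It assumes the simplest possible model, a constant level $\theta$ observed with error, so that the discounted least squares estimate is a weighted mean with weights $\beta^i$ on the observation $i$ periods old. The function names `discounted_mean` and `discounted_sse` are hypothetical.

```python
def discounted_mean(ys, beta):
    """Discounted least squares estimate of a constant level.

    Assumes the model Y_t = theta + error (an assumption made only for
    this sketch). `ys` is ordered oldest first, so the newest value gets
    weight 1 and a value i periods old gets weight beta**i.
    """
    num = 0.0  # discounted sum of observations
    den = 0.0  # discounted sum of weights
    for y in ys:
        # Recursive update: discount everything seen so far, add the new value.
        num = beta * num + y
        den = beta * den + 1.0
    return num / den


def discounted_sse(ys, theta, beta):
    """Discounted sum of squared errors about a candidate level theta."""
    n = len(ys)
    return sum(beta ** (n - 1 - k) * (y - theta) ** 2 for k, y in enumerate(ys))
```

With `beta = 1.0` the estimate reduces to the ordinary sample mean, which is one way to see that no discounting corresponds to ordinary least squares.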






II. SHORT-TERM FORECASTING MODELS 



A. LEAST-SQUARES ESTIMATION MODELS 

This classical prediction method has evolved, using re- 
gression analysis, by considering a functional relationship 
between the observed values of the random process and an 
independent or control variable. The General Linear Hy- 
pothesis is widely known and therefore will not receive 
more than a brief statement in this section. An excellent 
discussion of the conditions under which this approach is 
most appropriate may be found in Reference 1. 

Zehna [Ref. 1] and Coventry [Ref. 2] have investigated
the advantages of maximum likelihood estimation procedures
and have shown that these methods are superior to
exponential smoothing methods for both the constant mean
case and the linear model, where the forecast errors are
assumed to be normally distributed in both cases. No
comparisons were conducted for non-normal or autocorrelated
series in which
the maximum likelihood assumptions are not appropriate. The
relative robustness of the least squares method for various 
series generation model specifications is one of the objec- 
tives of this investigation. The simple regression model 
also provides a standard reference point from which the per- 
formance of exponentially weighted forecast models may be 
measured. 



1. Simple Least-Squares Forecasting

a. Series Generation Model

The process observations are assumed to be a
linear function of time, the independent variable. The
model may be written

$$Y_t = \beta_1 + \beta_2 X_t + \varepsilon_t \tag{A-1}$$

where $Y_t$ is the random process observation taken at time
$X_t$, and $\varepsilon_t$ is the random component contained in that
observation. It is assumed that $E[\varepsilon_t] = 0$ and
$\mathrm{Var}(\varepsilon_t) = \sigma^2$ for all $t$. For this investigation
$\beta_1$ is taken to be zero in the generation model, and all
observations are relative to that level.



Consideration of the generation model reveals
that forecast errors will be composed of two components,
errors of parameter estimation and the random variation from
the true linear model, Equation (A-1). As the number of
observations increases, it may be observed that the parameter
estimates will exhibit smaller variance until the forecast
error is almost entirely composed of the random variation
inherent in the process.

b. Parameter Estimation

The least squares estimate of $\beta_2$ in the zero-
intercept form based on observations $y_1, y_2, \ldots, y_n$ is given
by

$$\hat{\beta}_2 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}. \tag{A-2}$$




This form is used since all series in this comparison begin
at the origin, $X_0 = 0$, $Y_0 = 0$. The reader unfamiliar with
the development of this estimator is referred to References
1 and 2. It may be observed that Equation (A-2) above is
also the maximum likelihood estimator of $\beta_2$ when it is
further assumed that $\varepsilon_t$ has a normal distribution. The
variance of the estimated value of the process is given by



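$$\mathrm{VAR}(\hat{Y}_t) = \sigma^2 \left[\, \cdots \,\right]$$

[the bracketed factor, which involves a sum over $i = 1, \ldots, n$,
is illegible in the source]

where $\sigma^2$ is the (constant) variance of the random component.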
c. The Forecast Model

After parameter estimation using past observations
through time $X_{t-1}$ the forecast value of the random
series at time $X_t$ is given by

$$F_t = \hat{Y}_t = \hat{\beta}_2 X_t. \tag{A-3}$$

Although, as noted in Reference 1, the regression line must
not be extrapolated too far beyond the range of $X_i$ values,
this method is well suited for one-step-ahead prediction.

In fact, it may be shown [Ref. 1] that Equation (A-2) is
the zero intercept minimum variance unbiased linear
estimator for $\beta_2$.
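The zero-intercept estimate and the one-step-ahead forecast described in this section can be sketched as follows. This is an illustrative Python fragment, not the thesis program; the function names are hypothetical, and the inputs are assumed to be plain lists of times and observations.

```python
def zero_intercept_slope(xs, ys):
    """Least squares slope for a line through the origin:
    sum(x_i * y_i) / sum(x_i ** 2)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


def one_step_forecast(xs, ys, x_next):
    """Forecast at x_next using the slope fitted to all past observations."""
    return zero_intercept_slope(xs, ys) * x_next
```

For example, observations lying exactly on the line through the origin with slope 2 give a fitted slope of 2, and the forecast at the next time point is twice that time value.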

2. Modified Least-Squares Estimation

a. Consequences of Autocorrelated Disturbances

One of the crucial assumptions when dealing
with least-squares estimation models is the serial independence
of the disturbance term which is implied in $E[\varepsilon\varepsilon'] = \sigma^2 I$
and gives $E[\varepsilon_t \varepsilon_{t+s}] = 0$ for $s \neq 0$. Autocorrelated
disturbances arise frequently in the estimation of
relationships from time series data, and it will be seen that
almost all the series used in this comparison of forecast
models have significant autocorrelation. Some mention of the
main consequences of autocorrelated disturbances is therefore
in order. In the presence of such disturbances, the estimates
of $\beta_1$ and $\beta_2$ obtained by simple least-squares estimation
remain unbiased, but the variances of the estimators may be
large. This suggests the possibility of modifying the
procedure to reduce the variance. Further, if the usual
least-squares methods are applied, an underestimate of these
variances is likely to occur.

b. Series Generation Model 

The same linear model will be assumed here as
before in Equation (A-1), $Y_t = \beta_1 + \beta_2 X_t + \varepsilon_t$, except now it
will be assumed that the disturbances or shocks are no
longer independent random variables. Instead, a first-
order autoregressive model is used which is given by

$$\varepsilon_t = \rho\,\varepsilon_{t-1} + \delta_t \tag{A-4}$$

where $\{\delta_t\}$ has mean zero, variance $\sigma_\delta^2$ and zero covariances.

To give a better basis for comparison of the maximum
likelihood or least-squares model with the modified model, the
same normal variates will be used to generate the sequence
$\{\delta_t\}$ in Equation (A-4) above as were used to generate the
sequence $\{\varepsilon_t\}$ in Equation (A-1).



c. The Modified Forecast Model

To predict $Y_{t+1}$ for a given value of $X_{t+1}$ the
conditional expected value of Equation (A-1) may be taken,
giving

$$E[Y_{t+1} \mid \varepsilon_1, \ldots, \varepsilon_t] = \beta_1 + \beta_2 X_{t+1} + E[\varepsilon_{t+1} \mid \varepsilon_1, \ldots, \varepsilon_t] = \beta_1 + \beta_2 X_{t+1} + \rho\,\varepsilon_t.$$

Here the result involves the conditional expectation of
Equation (A-4) also. Substituting the value of
$\varepsilon_t = Y_t - \beta_1 - \beta_2 X_t$ from Equation (A-1) gives, after
rearrangement,

$$E[Y_{t+1} \mid \varepsilon_1, \ldots, \varepsilon_t] = \beta_1(1-\rho) + \beta_2(X_{t+1} - \rho X_t) + \rho Y_t.$$

This can be rewritten as

$$E[Y_{t+1} - \rho Y_t \mid \varepsilon_1, \ldots, \varepsilon_t] = \beta_1(1-\rho) + \beta_2(X_{t+1} - \rho X_t).$$

If $\rho$ is known, Equation (A-1) may then be written

$$Y_t - \rho Y_{t-1} = \beta_1(1-\rho) + \beta_2(X_t - \rho X_{t-1}) + \delta_t. \tag{A-5}$$

This transformation satisfies in full the assumptions of the
simple linear model, and permits the direct application of
least squares to the transformed variables $(Y_t - \rho Y_{t-1})$ and
$(X_t - \rho X_{t-1})$ yielding the best linear predictors. The best
linear predictor of $(Y_{t+1} - \rho Y_t)$ is
$\hat{\beta}_1(1-\rho) + \hat{\beta}_2(X_{t+1} - \rho X_t)$,
where $\hat{\beta}_1$ and $\hat{\beta}_2$ are the least-squares estimators of the
parameters in Equation (A-5). This is equivalent to saying
that the best linear predictor of $Y_{t+1}$, conditional on past
observations, is $\hat{\beta}_1(1-\rho) + \hat{\beta}_2(X_{t+1} - \rho X_t) + \rho Y_t$,
which may be written more clearly as

$$\hat{Y}_{t+1} = \hat{\beta}_1 + \hat{\beta}_2 X_{t+1} + \rho\,\big(Y_t - (\hat{\beta}_1 + \hat{\beta}_2 X_t)\big). \tag{A-6}$$




Analysis of this form of the predictor provides some
insight and permits comparison with the unmodified form of
the model. By incorporating the available knowledge of
the autoregressive structure of the process a term is added
to the estimate of $Y_{t+1}$. This term is the product of $\rho$ and
the estimated disturbance (actually the forecast error) of
the previous period.

For a further discussion of the modified model 
and the effect of autocorrelation on the linear model, 
Chapter 7 of Reference 12 is recommended. 
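The modified procedure of this section, with the autocorrelation coefficient treated as known, can be sketched in two steps: ordinary least squares on the transformed variables $Y_t - \rho Y_{t-1}$ and $X_t - \rho X_{t-1}$, followed by a forecast that adds $\rho$ times the last residual. The Python fragment below is an illustration under those assumptions and is not the thesis program; the function names are hypothetical, and $\rho \neq 1$ is required because the fitted intercept estimates $\beta_1(1-\rho)$.

```python
def fit_transformed(xs, ys, rho):
    """Least squares on the rho-differenced variables.

    Returns estimates (b1, b2) of the intercept and slope of the
    original linear model. Requires rho != 1.
    """
    u = [xs[t] - rho * xs[t - 1] for t in range(1, len(xs))]
    v = [ys[t] - rho * ys[t - 1] for t in range(1, len(ys))]
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    sxx = sum((a - u_bar) ** 2 for a in u)
    sxy = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    b2 = sxy / sxx
    intercept = v_bar - b2 * u_bar   # estimates beta_1 * (1 - rho)
    b1 = intercept / (1.0 - rho)
    return b1, b2


def modified_forecast(xs, ys, rho, x_next):
    """Trend forecast plus rho times the most recent estimated disturbance."""
    b1, b2 = fit_transformed(xs, ys, rho)
    last_residual = ys[-1] - (b1 + b2 * xs[-1])
    return b1 + b2 * x_next + rho * last_residual
```

When the data lie exactly on a line the last residual is zero and the forecast coincides with the ordinary trend extrapolation; the correction term only matters when the most recent observation departs from the fitted line.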



B. THE SIMPLE EXPONENTIALLY WEIGHTED MOVING AVERAGE
FORECAST MODEL

1. Integrated Moving Average (IMA) Processes

Based on the belief
that the more recent observations of a process represent
that process better than older data, EWMA methods which give 
larger weight to the more recent data in constructing and 
adjusting the model have been developed. It is intuitively 
clear that such discounting of older data is not applicable 
when a stationary process is observed, since even the oldest 
observations are as representative of the process as the 
most recent. 

Stochastic processes of the type for which EWMA 
forecasts are appropriate have been described by Box and 
Jenkins [Ref. 3] as integrated moving average processes. 
These processes act as though no fixed mean exists, but do 
exhibit homogeneity in the sense that, apart from local 



level and trend, one part of the series behaves much like
any other part. To clarify the description of IMA
processes three equivalent forms of the model may be formulated
[Ref. 3].

a. The difference equation form may be written

$$\nabla Y_t = (1 - \beta B)\,\varepsilon_t, \qquad -1 < \alpha < 1$$

where $\nabla$ is the difference operator and $B$ is the backwards
operator. The model may also be written using these
definitions as

$$Y_t - Y_{t-1} = -\beta\,\varepsilon_{t-1} + \varepsilon_t.$$

b. The random shock form follows from the difference
equation form by use of recursive substitution for the older
observations,

$$Y_t = \alpha \sum_{j=1}^{t-1} \varepsilon_{t-j} + \varepsilon_t \tag{B-2}$$

where $Y_0 = 0$.

c. The inverted form follows, writing recursively
in terms of the observations rather than the random shocks,

$$Y_t = \hat{Y}_t + \varepsilon_t$$

where

$$\hat{Y}_t = \alpha \sum_{j=1}^{\infty} \beta^{j-1}\, Y_{t-j}$$

from which form, the model can easily be written in
recursive form

$$\hat{Y}_{t+1} = \alpha Y_t + \beta \hat{Y}_t. \tag{B-3}$$




This is recognizable as the recursion formula for the ex- 
ponentially weighted moving average. It follows then that 
EWMA forecasts should be appropriate for IMA processes. A 
demonstration that this is true is contained in section B.3. 
The form of Equation (B-2) is most convenient for generating 
IMA series, while the form of Equation (B-3) is most often 
used for forecasting. 
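The correspondence between the random shock form used for generation and the recursive form used for forecasting can be checked numerically. The following Python sketch is illustrative only and is not the thesis program. It assumes the series starts from a level of zero, so the first forecast is zero, and the function names are hypothetical. If the two forms agree, the one-step-ahead errors of the recursion should reproduce the generating shocks.

```python
import random


def generate_ima(shocks, alpha):
    """Random shock form: each observation is alpha times the sum of all
    earlier shocks plus the current shock (series starts at level zero)."""
    ys = []
    cumulative = 0.0  # running sum of shocks before the current period
    for e in shocks:
        ys.append(alpha * cumulative + e)
        cumulative += e
    return ys


def ewma_forecasts(ys, alpha, start=0.0):
    """One-step-ahead EWMA forecasts; forecasts[k] is made before ys[k]
    is observed, then updated as alpha * y + (1 - alpha) * forecast."""
    forecasts = []
    f = start
    for y in ys:
        forecasts.append(f)
        f = alpha * y + (1.0 - alpha) * f
    return forecasts
```

Subtracting `ewma_forecasts(ys, alpha)` from `ys` element by element, with the same `alpha` used in `generate_ima`, returns the original shocks up to rounding error.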

2 . The Series Generation Model 

The specific EWMA generating model used in this
study can be treated as a simple random walk process, one
which is reasonably stable with no trends. The model is
given by

$$Y_t = \bar{Y}_t + \varepsilon_t, \qquad \bar{Y}_t = \bar{Y}_{t-1} + \gamma_t \tag{B-4}$$

where $Y_t$ is the observed value, $\bar{Y}_t$ is the true level of the
process at time $t$, $\gamma_t$ is a slight random change from the
prior process level, and $\varepsilon_t$ is a random shock or
superimposed error. $\varepsilon_t$ and $\gamma_t$ are uncorrelated random
variables with zero mean and constant variances $V(\varepsilon)$ and
$V(\gamma)$ respectively. It will be shown in section B.5 that the
EWMA forecast model supplies the optimal linear least-squares
predictor for this generating model. A further result,
provided $0 < \alpha < 1$, is that any generating process for which
the EWMA is optimal can be represented by the generating
model above in Equation (B-4) even though this model may not
necessarily describe the true generating process. It must,
however, describe the
external behavior of the true generating process accurately



or else the EWMA would not then be optimal. Stated another
way, if a process cannot be represented by Equation (B-4)
then the EWMA forecast cannot be an optimal forecast rule 
for that series. 
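A series of the kind described in this section, a slowly wandering level observed with superimposed error, may be generated as in the sketch below. This Python fragment is an illustration only and not the thesis program. It assumes normally distributed changes and errors and a starting level of zero; the function name and parameters are hypothetical.

```python
import random


def generate_level_plus_noise(n, sd_eps, sd_gamma, seed=0):
    """Return (levels, observations) for a random-walk level plus noise.

    Each period the level moves by a normal change with standard
    deviation sd_gamma, and the observation is the level plus an
    independent normal error with standard deviation sd_eps.
    """
    rng = random.Random(seed)
    level = 0.0
    levels = []
    observations = []
    for _ in range(n):
        level += rng.gauss(0.0, sd_gamma)          # permanent change in level
        observations.append(level + rng.gauss(0.0, sd_eps))  # transient error
        levels.append(level)
    return levels, observations
```

The ratio of the two variances governs how quickly a forecast should respond: a larger level variance relative to the error variance favors a larger smoothing constant.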

3. Optimal Properties of Exponentially Weighted Forecasts

It has been suggested intuitively in the preceding
discussion of IMA processes that the EWMA recursion
relation describes the process exactly. Following the approach
of Muth [Ref. 4], it can be shown that for an IMA process
the EWMA equals the conditional expected value of the
process.

The EWMA forecast results from a model of
expectations which are adapted to the most recent information
concerning the process. Let $Y_t$ represent that part of a time
series which cannot be explained by systematic factors such
as seasons or trends in the average. Let $\hat{Y}_t$ represent the
forecast, or conditional expectation of $Y_t$ which is made at
time $t-1$ on the basis of available information at that time.
Assume that the forecast is changed from one period to the
next by an amount proportional to the latest observed error.
This implies that a permanent component exists in every
observed error and that the level of the process subsequently
reflects this component:

$$\hat{Y}_t = \hat{Y}_{t-1} + \alpha\,(Y_{t-1} - \hat{Y}_{t-1}). \tag{B-5}$$

This in turn yields the EWMA formula:

$$\hat{Y}_t = \alpha \sum_{i=1}^{\infty} \beta^{i-1}\, Y_{t-i}. \tag{B-6}$$




Since $\beta = 1 - \alpha$, the weights $\alpha\beta^{i-1}$ sum to one, so the
weights corresponding to previous values of $Y$ do not
introduce any systematic bias.

Assume now that the observed value of the process
can be written as a linear function of the independent
random shocks:

$$Y_t = \sum_{i=1}^{\infty} w_i\, \varepsilon_{t-i} + \varepsilon_t \tag{B-7}$$

where the shocks are i.i.d. with mean zero and variance $\sigma^2$.

If the parameters which characterize the random
process are known, the expected value of $Y_t$ may be easily
found. If it is desired to do so one period in advance of
time $t$ when $\varepsilon_t$ is as yet unknown, the conditional expected
value of $Y_t$ given the past values $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots$, may be
used:

$$\hat{Y}_t = E[Y_t \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots] = \sum_{i=1}^{\infty} w_i\, \varepsilon_{t-i}. \tag{B-8}$$




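To relate the regression function above to the EWMA
expression, Equation (B-8) must be written as follows in
terms of the observations, where the values of the
coefficients $u_j$ must be determined:

$$\hat{Y}_t = \sum_{j=1}^{\infty} u_j\, Y_{t-j}. \tag{B-9}$$

Substituting for $Y_{t-j}$ with the random shock form and
rearranging terms results in:

$$\hat{Y}_t = \sum_{j=1}^{\infty} u_j \Big( \varepsilon_{t-j} + \sum_{i=1}^{\infty} w_i\, \varepsilon_{t-i-j} \Big) = u_1 \varepsilon_{t-1} + \sum_{i=2}^{\infty} \Big( u_i + \sum_{j=1}^{i-1} u_j w_{i-j} \Big) \varepsilon_{t-i}. \tag{B-10}$$
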



A relationship between the parameters ($w_i$) associated
with the unobserved shocks and the coefficients ($u_j$)
associated with the past observations of the process, is
obtained by comparing coefficients of their respective
expressions:

$$w_1 = u_1, \qquad w_i = u_i + \sum_{j=1}^{i-1} u_j\, w_{i-j}, \quad i = 2, 3, 4, \ldots$$

To demonstrate that the EWMA forecast evolves from the
above process, the weights $u_j = \alpha\beta^{j-1}$, $j = 1, 2, 3, \ldots$ from
the inverted form are substituted into the above expression.
This results in the system:

$$w_1 = \alpha, \qquad w_i = \alpha\beta^{i-1} + \alpha \sum_{j=1}^{i-1} \beta^{j-1}\, w_{i-j}, \quad i = 2, 3, 4, \ldots \tag{B-11}$$

It can be seen from the above that $w_i = \alpha$ for all $i \ge 1$.

Writing $Y_t$ in the random shock form and substituting
$w_i = \alpha$ results in a form comparable to Equation (B-2)

$$Y_t = \alpha \sum_{i=1}^{t-1} \varepsilon_{t-i} + \varepsilon_t.$$




Consideration of this expression reveals that the 
shock associated with each time period has a weight of unity, 
but previous shock weights are constant with a weight between 
zero and one. This demonstrates that part of the random 
shock experienced in a period has a permanent effect, but 
the rest of the shock affects the process only during the 
current period. 
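The claim that every past shock carries the same weight can be verified numerically from the coefficient relation used in this section, namely that the first shock weight equals the first observation coefficient and each later shock weight equals the corresponding observation coefficient plus a convolution of earlier coefficients and weights. The Python sketch below is illustrative only; the function name is hypothetical, and the observation coefficients are assumed to be the exponentially declining EWMA weights, written here as `u`.

```python
def shock_weights(alpha, n):
    """First n shock weights implied by EWMA observation coefficients.

    Uses u_j = alpha * beta**(j-1) and the recursion
    w_1 = u_1, w_i = u_i + sum_{j=1}^{i-1} u_j * w_{i-j}.
    """
    beta = 1.0 - alpha
    u = [alpha * beta ** (j - 1) for j in range(1, n + 1)]  # u[j-1] holds u_j
    w = []  # w[i-1] holds w_i
    for i in range(1, n + 1):
        convolution = sum(u[j - 1] * w[i - j - 1] for j in range(1, i))
        w.append(u[i - 1] + convolution)
    return w
```

For any smoothing constant strictly between zero and one, every entry of the returned list equals the smoothing constant up to rounding error, which is the constant-weight result stated in the text.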



The foregoing assumed that forecasts were always
for one period ahead only, but it can be shown that the best
forecasts for all future periods are the same. The proof
may be found in [Ref. 4] and is a generalization of the above
approach where
$\hat{Y}_{t,T} = \sum_{i=0}^{\infty} w_{i+T}\, \varepsilon_{t-i}$
represents the forecast $T$ periods ahead. The unique solution
for the coefficients results in $u_{T,k} = \alpha\beta^{k}$,
$k = 0, 1, 2, \ldots$ independent of $T$ as asserted. The reason for
this result is that all prior shocks are weighted equally and
the forecasts then only estimate the permanent component of
the shocks.

Muth [Ref. 4] also shows that the EWMA forecast
rule is appropriate if the permanent and transitory
components are independent rather than perfectly correlated
[text illegible in source]
case however is constrained to be a function of a
characteristic root of a system of difference equations.

4 . The Forecast Model as an IMA Process Generator 

The familiar EWMA forecast model is identical to the
IMA process in Equation (B-3),

$$F_{t+1} = \hat{Y}_{t+1} = \alpha Y_t + \beta \hat{Y}_t,$$

which may be rewritten in the random shock or forecast error
form of Equation (B-2) by substitution of the error
$\varepsilon_t = Y_t - \hat{Y}_t$. This yields

$$\hat{Y}_{t+1} = \hat{Y}_t + \alpha\,\varepsilon_t.$$

Successive substitutions for the oldest estimate in terms of
the prior estimate plus the permanent component of the
random shock yields a form equivalent to Equation (B-2)

$$\hat{Y}_{t+1} = \hat{Y}_0 + \alpha \sum_{i=0}^{t} \varepsilon_i$$

and here $\hat{Y}_0 = 0$ is the chosen level from which the process
is measured. The form of Equation (B-2) is such that a series
of autocorrelated forecasts may be easily generated.
Following Muth's interpretation [Ref. 4], this generation
method merely adds the permanent component of the random
shock to the current process estimate to yield the next
period estimate.

Since the next period has an associated forecast
error, that error is then added to produce the desired
observation:

$$Y_{t+1} = F_{t+1} + \varepsilon_{t+1} = \alpha \sum_{i=0}^{t} \varepsilon_i + \varepsilon_{t+1}$$

which by substitution of Equation (B-2) is in a form suit-
[text illegible in source]
This model, having no fixed mean, is free to drift in a
random walk. It may be observed that when $\varepsilon_t$ is reduced to
zero, the model no longer experiences shocks and becomes
deterministic at the last $\hat{Y}_t$ for which $\varepsilon_t > 0$. The
asymptotic variance of the generated series may be determined
by observing from Equation (B-2) with $t \to \infty$ that
$E[Y] = E[\varepsilon] = 0$ and

$$\sigma_Y^2 = E[Y^2] - E[Y]^2 = E\Big[ \alpha \sum_{i=0}^{\infty} \beta^{i}\, \varepsilon_{t-i} \Big]^2 = \frac{\alpha}{2-\alpha}\,\sigma^2. \tag{B-12}$$

In generating the autocorrelated series, the
constant $\alpha$ controls both the amount of correlation which
exists and the series variance. When $\alpha = 1$, the series is
perfectly correlated with the random series and has
variance $\sigma^2$.




5. Parameter Estimation Procedures 



To develop optimal parameters for a specific series,
consider a strictly stationary stochastic process with

$$E[Y_t] = 0, \qquad \mathrm{VAR}[Y_t] = \sigma^2, \qquad \mathrm{COV}[Y_t, Y_{t+k}] = \rho_k \sigma^2.$$

If the EWMA model is used for forecasts, then the mean squared
error of prediction is by definition:

$$\mathrm{M.S.E.} = E[Y_{t+1} - \hat{Y}_t]^2.$$

Expanding this expression,

$$\mathrm{M.S.E.} = E[Y_{t+1}^2] - 2E[Y_{t+1}\hat{Y}_t] + E[\hat{Y}_t^2] \tag{B-13}$$

where

$$E[Y_{t+1}^2] = \sigma^2. \tag{B-14}$$


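Substituting $\hat{Y}_t = \alpha \sum_{i=0}^{\infty} \beta^i Y_{t-i}$ in the second
term above gives

$$-2E[Y_{t+1}\hat{Y}_t] = -2E\Big[ Y_{t+1}\, \alpha \sum_{i=0}^{\infty} \beta^i Y_{t-i} \Big] = -2\alpha \sum_{i=0}^{\infty} \beta^i E[Y_{t+1} Y_{t-i}] = -2\alpha \sum_{i=0}^{\infty} \beta^i \rho_{i+1} \sigma^2 = -2\alpha\sigma^2 \sum_{i=0}^{\infty} \beta^i \rho_{i+1}. \tag{B-15}$$
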

Expanding the third term in Equation (B-13) and substituting
for $\hat{Y}_t$ yields

$$E[\hat{Y}_t^2] = \alpha^2 E\Big[ \sum_{i=0}^{\infty} \beta^i Y_{t-i} \Big]^2.$$

Expanding the expression formally results in

$$\alpha^2 E[Y_t + \beta Y_{t-1} + \beta^2 Y_{t-2} + \cdots]^2 = \alpha^2 W \tag{B-16}$$

where

$$W = E[Y_t + \beta Y_{t-1} + \beta^2 Y_{t-2} + \cdots]^2. \tag{B-17}$$

Since $\{Y_t\}$ is a strictly stationary process, $W$ may also be
written $W = E[Y_{t-1} + \beta Y_{t-2} + \beta^2 Y_{t-3} + \cdots]^2$. Rearranging
$W$ this becomes

$$W = E[Y_t + \beta(Y_{t-1} + \beta Y_{t-2} + \beta^2 Y_{t-3} + \cdots)]^2$$

which may now be expanded. Proceeding formally, this
procedure results in

$$W = E[Y_t^2] + 2E[Y_t\, \beta(Y_{t-1} + \beta Y_{t-2} + \beta^2 Y_{t-3} + \cdots)] + \beta^2 W$$

which may be written more briefly as

$$W = \sigma^2 + 2E\Big[ \sum_{i=1}^{\infty} \beta^i Y_t Y_{t-i} \Big] + \beta^2 W = \sigma^2 + 2\sigma^2 \sum_{i=1}^{\infty} \beta^i \rho_i + \beta^2 W = \mu + \beta^2 W.$$

Here $\mu$ is defined as the sum of the first two terms. Solving
for $W$, this becomes $W = \mu / (1 - \beta^2)$. Again substituting $W$
into Equation (B-16) results in

$$\alpha^2 W = \frac{\alpha^2 \mu}{1 - \beta^2} = \frac{\alpha\mu}{1 + \beta}. \tag{B-18}$$




Combining Equation (B-14), Equation (B-15) and Equation
(B-18) in Equation (B-13) yields

$$\mathrm{M.S.E.} = \frac{2\sigma^2}{2-\alpha} - \frac{2\alpha\sigma^2}{2-\alpha} \sum_{i=1}^{\infty} \beta^{i-1} \rho_i \tag{B-19}$$

which is the mean squared error of one-step-ahead prediction
as a function of the smoothing constant and the
autocorrelation of the series. This is a result obtained by
Cox, p. 415, Ref. 5. For a Markov series with exponential
autocorrelation

$$\rho_k = \rho^k, \qquad k \ge 0,$$

where $k$ is the lag between observations, the mean squared
error for one-period ahead forecasts becomes

$$\mathrm{M.S.E.} = \frac{2\sigma^2 (1-\rho)}{(1 - \beta\rho)(1 + \beta)}. \tag{B-20}$$




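The MSE is minimized for a given $\rho$ by an EWMA predictor
which has parameters

$$\beta_{opt} = \frac{1-\rho}{2\rho} \ \ \text{or} \ \ \alpha_{opt} = \frac{3\rho - 1}{2\rho}, \quad (1/3 \le \rho \le 1); \qquad \beta_{opt} = 1 \ \ \text{or} \ \ \alpha_{opt} = 0, \quad (-1 < \rho < 1/3). \tag{B-21}$$

These results are obtained by equating to zero the derivative
of Equation (B-20). The corresponding mean squared error of
prediction for the EWMA with optimal parameters is obtained
from Equation (B-20) by substitution

$$\mathrm{M.S.E.} = \begin{cases} \dfrac{8\sigma^2 \rho (1-\rho)}{(1+\rho)^2}, & (1/3 \le \rho \le 1) \\[2ex] \sigma^2, & (-1 < \rho < 1/3) \end{cases}$$
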
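The dependence of the one-step-ahead mean squared error on the discounting factor for a series with exponential autocorrelation can be explored with a short numerical sketch. The Python fragment below is illustrative only and is not the thesis program. It codes the mean squared error as $2\sigma^2(1-\rho)/[(1-\beta\rho)(1+\beta)]$ and the minimizing discounting factor as $(1-\rho)/(2\rho)$ when $\rho$ exceeds one third and as one otherwise. The function names are hypothetical.

```python
def mse_markov(beta, rho, sigma2=1.0):
    """One-step-ahead MSE of the EWMA for autocorrelation rho**k,
    as a function of the discounting factor beta (requires rho < 1)."""
    return 2.0 * sigma2 * (1.0 - rho) / ((1.0 - beta * rho) * (1.0 + beta))


def optimal_beta(rho):
    """Discounting factor minimizing mse_markov over 0 <= beta <= 1."""
    if rho > 1.0 / 3.0:
        return (1.0 - rho) / (2.0 * rho)
    return 1.0  # smoothing constant of zero: forecast the mean only
```

For instance, with a lag-one autocorrelation of 0.6 the optimal discounting factor is one third and the resulting mean squared error is three quarters of the series variance, while for an autocorrelation of 0.2 no choice of discounting factor improves on the series variance itself.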



Analysis of this result discloses that when $\rho > 1/3$ it is optimal to predict the observed series values using a larger $\alpha$ value since there is sufficient correlation between successive observations to make this procedure advantageous. When $\rho < 1/3$ the EWMA should attempt to predict only the mean value of the process since so much uncertainty exists concerning the next observation. When the optimal $\alpha$ equals zero, the MSE of prediction equals the variance of the time series. When $\rho = \alpha = 1$ perfect correlation permits exact forecasts and the M.S.E. is zero.
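
As a hedged illustration of Equations (B-20) and (B-21), the following Python sketch evaluates the one-step MSE for a Markov series and the smoothing constant that minimizes it. The function names are hypothetical and are not taken from the thesis program; the series variance is assumed to default to 1.

```python
def ewma_mse(rho, beta, sigma2=1.0):
    """One-step-ahead MSE of the EWMA on a Markov series, Equation (B-20)."""
    return 2.0 * sigma2 * (1.0 - rho) / ((1.0 - beta * rho) * (1.0 + beta))


def optimal_alpha(rho):
    """Smoothing constant of Equation (B-21); zero when rho is below 1/3."""
    if rho <= 1.0 / 3.0:
        return 0.0
    return (3.0 * rho - 1.0) / (2.0 * rho)


def optimal_mse(rho, sigma2=1.0):
    """MSE obtained by substituting the optimal constant into (B-20)."""
    return ewma_mse(rho, 1.0 - optimal_alpha(rho), sigma2)


# Compare the optimal MSE with the MSE for alpha = 0.1 (beta = 0.9).
for rho in (0.2, 0.5, 0.9):
    print(rho, optimal_alpha(rho), optimal_mse(rho), ewma_mse(rho, 0.9))
```

Comparing the last two printed columns for a small $\rho$ gives a feel for how little is lost by using $\alpha = 0.1$ when the optimal $\alpha$ is zero.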

The preceding results are obtained by Cox [Ref. 5] who also developed tables showing the relationships between $\rho$, $\alpha_{opt}$, and the MSE/$\sigma^2$ ratio. It is interesting to note that as one predicts more than one step ahead, the critical value of $\rho$ increases. One conclusion drawn from Cox's tables is that MSE is not sensitive to the choice of a smoothing constant, and although the optimal $\alpha$ may be zero, it costs little to use $\alpha = 0.1$ as insurance against a possible change in the mean level of the process. This insensitivity probably accounts for the success of the exponential smoothing model proposed by Brown [Ref. 6] for which he uses [one line illegible in source] $\alpha = 0.1$ for general use.

An alternative approach to optimal parameter derivation for the EWMA forecast model used by Harrison [Ref. 7] may be taken by considering the generating model in Equation (B-4). Recalling that the one-step-ahead forecast error is $e_t = Y_t - F_t = Y_t - \hat{Y}_{t-1}$, and that the forecast model is $\hat{Y}_t = \hat{Y}_{t-1} + \alpha e_t$, a difference equation may be written to express the one-step-ahead forecast error

$$e_{t+1} = (1-\alpha)e_t + \varepsilon_{t+1} - \varepsilon_t + \gamma_{t+1}. \qquad \text{(B-22)}$$

Then the expectation of the product of Equation (B-22) with both $e_{t+1}$ and $e_t$ yields the variance and first covariance respectively of the one-step-ahead forecast errors






$$V = (1-\alpha)C + (1+\alpha)V(\varepsilon) + V(\gamma)$$
$$C = (1-\alpha)V - V(\varepsilon). \qquad \text{(B-23)}$$



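Solving these two equations for V yields

$$V = \frac{2\alpha V(\varepsilon) + V(\gamma)}{\alpha(2-\alpha)}. \qquad \text{(B-24)}$$

Differentiating this with respect to $\alpha$ gives the minimum variance parameter

$$\alpha = \frac{(1+4R)^{1/2} - 1}{2R} \quad \text{where } R = \frac{V(\varepsilon)}{V(\gamma)}. \qquad \text{(B-25)}$$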



Substituting for $\alpha$ in Equation (B-24) it is found that the minimum variance is $V = V(\varepsilon)/(1-\alpha)$ and the covariance is 0. The error covariances at longer lags may easily be shown to be zero by taking the expectation of the product of Equation (B-22) with $e_{t-i}$, $i = 1, 2, \ldots$; since the forecast errors are then uncorrelated, the predictor is optimal. Although this approach was used by Harrison [Ref. 7], he then recommended that, because of the usual shortage of data and for the sake of general robustness and precision, the optimal parameter for short-term forecasting should be derived from a simulation of the predictor on the original series data. This has also been the suggestion of Cox, Brown and others. It appears that even though the optimal parameters may be determined for particular series on a theoretical basis, it is faster, and possibly about as satisfactory in terms of forecast results, to determine the parameters empirically. The insensitivity of the simple EWMA model to the parameter value chosen has made parameter selection an area of rapidly diminishing return for the effort expended.
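
The relations (B-24) and (B-25) can be checked numerically. The sketch below is only an illustration under the steady-model assumptions of this section; the function names are hypothetical, and the variances passed in are arbitrary test values rather than data from the thesis.

```python
import math


def optimal_alpha_steady(var_eps, var_gamma):
    """Minimum-variance smoothing constant of Equation (B-25)."""
    r = var_eps / var_gamma
    return (math.sqrt(1.0 + 4.0 * r) - 1.0) / (2.0 * r)


def error_variance(alpha, var_eps, var_gamma):
    """One-step-ahead forecast error variance of Equation (B-24)."""
    return (2.0 * alpha * var_eps + var_gamma) / (alpha * (2.0 - alpha))


# Arbitrary illustrative variances: V(eps) = 2, V(gamma) = 1, so R = 2.
alpha = optimal_alpha_steady(2.0, 1.0)
print(alpha, error_variance(alpha, 2.0, 1.0), 2.0 / (1.0 - alpha))
```

The last two printed values should agree, since the minimum variance equals $V(\varepsilon)/(1-\alpha)$ at the optimum.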






C. COX'S MODIFIED EXPONENTIALLY WEIGHTED MOVING AVERAGE 
FORECAST MODEL 

1. Series Generation Model for the Exponential Autocorrelation Function

Cox [Ref. 5] has shown, as noted in Section B.4, that the exponentially weighted moving average is an optimal forecast method for time series having an autocorrelation coefficient with lag k of the form $\rho(k) = \beta^k$. A recursion relation which is identical to the relation for exponential smoothing and which generates such a time series is shown by Naylor [Ref. 8]. This recursion relation is given by

$$Y_0 = (1-\beta)\varepsilon_0$$
$$Y_t = \beta Y_{t-1} + (1-\beta)\varepsilon_t$$

where the $\{\varepsilon_t\}$ are mutually independent random variables with zero mean and variance $\sigma_\varepsilon^2$.

The method used in this investigation to produce the exponential autocorrelation function requires that the $\varepsilon_t$ be uniform random numbers in the interval [-A, A]. This choice of $\varepsilon_t$ results in exponentially autocorrelated variates $Y_t$ which have zero mean and variance equal to

$$\sigma_y^2 = \frac{1-\beta}{1+\beta}\,\sigma_\varepsilon^2 = \frac{\alpha}{2-\alpha}\,\sigma_\varepsilon^2$$

where $\alpha$, the smoothing constant, equals $1-\beta$. That the recursion relation above gives the desired result may be seen by using recursive substitution so that

$$Y_t = \alpha\sum_{j=0}^{t-1}\beta^j\varepsilon_{t-j} + \beta^t Y_0$$

or in terms of the random variable alone

$$Y_t = \alpha\sum_{j=0}^{t-1}\beta^j\varepsilon_{t-j} + \alpha\beta^t\varepsilon_0.$$






The mean of the generated time series follows immediately since $E[Y] = E[\varepsilon] = 0$ and the variance is

$$\sigma_y^2 = E[Y^2] - E[Y]^2 = E[Y^2].$$

Squaring the recursion relation and taking expected value gives $\sigma_y^2 = \beta^2E[Y_{t-1}^2] + 2\alpha\beta E[\varepsilon_tY_{t-1}] + \alpha^2E[\varepsilon_t^2]$ which gives, since $\varepsilon_t$ and $Y_{t-1}$ are independent with mean zero, the result

$$(1-\beta^2)\sigma_y^2 = \alpha^2\sigma_\varepsilon^2.$$

Therefore, $\sigma_y^2 = \frac{\alpha}{2-\alpha}\sigma_\varepsilon^2$, which is equivalent to the variance expression above. When, as in this investigation, the $\varepsilon_t$ are uniformly distributed on the range [-A, A] the variance becomes

$$\sigma_y^2 = \frac{\alpha}{2-\alpha}\,\sigma_\varepsilon^2 = \frac{\alpha A^2}{3(1+\beta)}.$$



The autocovariance for two numbers which are k observations apart in the time series is

$$\phi(k) = E[Y_tY_{t+k}] = \beta^k\sigma_y^2$$

which implies that the autocorrelation coefficient is $\rho(k) = \phi(k)/\sigma_y^2 = \beta^k$, which is the desired result. This result is obtained by making a transformation on the summations in the product $Y_tY_{t+k}$ which makes use of the stationarity of the series and then taking the expected value of the product $Y_tY_{t+k}$ while both are expressed in terms of the random shocks. The transformation referred to above is

$$Y_{t+k} = \alpha\sum_{j=0}^{t+k-1}\beta^j\varepsilon_{t+k-j} + \alpha\beta^{t+k}\varepsilon_0 = \alpha\sum_{j=0}^{k-1}\beta^j\varepsilon_{t+k-j} + \beta^k\Big[\alpha\sum_{j=0}^{t-1}\beta^j\varepsilon_{t-j} + \alpha\beta^t\varepsilon_0\Big].$$

The bracketed term is $Y_t$, and the shocks in the first sum occur after time t and are independent of $Y_t$, so the product $Y_tY_{t+k}$ has the same expectation as $\beta^kE[Y_t^2]$. Therefore, $E[Y_tY_{t+k}] = \beta^kE[Y^2] = \beta^k\sigma_y^2 = \phi(k)$ from which the result follows immediately.
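
The generation scheme above can be sketched in a few lines of Python. This is an illustrative reconstruction only, not the thesis program: the function names, the seed, the series length and the values $\beta = 0.6$, A = 1 are all assumptions chosen for the demonstration.

```python
import random


def generate_series(n, beta, half_width, seed=0):
    """Generate Y_t = beta*Y_{t-1} + (1-beta)*eps_t with eps_t ~ U(-A, A)."""
    rng = random.Random(seed)
    alpha = 1.0 - beta
    y = alpha * rng.uniform(-half_width, half_width)  # Y_0 = (1-beta)*eps_0
    series = [y]
    for _ in range(n - 1):
        y = beta * y + alpha * rng.uniform(-half_width, half_width)
        series.append(y)
    return series


def sample_variance(series):
    mean = sum(series) / len(series)
    return sum((v - mean) ** 2 for v in series) / len(series)


def sample_autocorrelation(series, k):
    """Lag-k sample autocorrelation coefficient."""
    n = len(series)
    mean = sum(series) / n
    cov = sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / n
    return cov / sample_variance(series)


def theoretical_variance(beta, half_width):
    """sigma_y^2 = alpha/(2-alpha) * A^2/3 for uniform shocks on [-A, A]."""
    alpha = 1.0 - beta
    return alpha / (2.0 - alpha) * half_width ** 2 / 3.0


series = generate_series(100000, beta=0.6, half_width=1.0, seed=1)
print(sample_variance(series), theoretical_variance(0.6, 1.0))
print(sample_autocorrelation(series, 1), sample_autocorrelation(series, 2))
```

For a long series the sample variance should lie close to the theoretical value, and the lag-k sample autocorrelations close to $\beta^k$, up to sampling error.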

2. Wiener's Optimal Linear Predictor

Cox [Ref. 5], after developing the optimal smoothing parameters for exponentially correlated ($\rho_k = \rho^k$) time series which are shown in Section B.4, sought to improve the model so that it did not lag behind when used to forecast series containing an increasing mean value. He made use of Wiener's optimal linear predictor for this same series which is given by:

$$\hat{Y}_{t+h}(\text{opt.}) = \rho^h(Y_t - \mu) + \mu = \rho^hY_t + (1-\rho^h)\mu \qquad \text{(C-1)}$$

where h is the number of periods ahead for which a forecast is desired, and $\rho$ is the autocorrelation coefficient of the series to be forecast. The Wiener predictor has a mean square error of forecast which may be written

$$\text{M.S.E.}_W = (1-\rho^{2h})\sigma^2. \qquad \text{(C-2)}$$

The Wiener predictor, although shown optimal when all assumptions were met, had two properties which made it objectionable for practical use. First, the parameters $\mu$ and $\rho$ are assumed known. This can be overcome if sufficient past observations are available for use in estimating $\mu$ and $\rho$, since the mean squared error is asymptotically unaffected by the substitution of unbiased estimates for these parameters. The more serious objection to the predictor, Equation (C-1), however, is that it becomes biased if the mean shifts, correspondingly increasing the mean square error of forecast.

3. The Modified Form of the Model

Cox extended Equation (C-1) by substituting $\hat{Y}_t$ for $\mu$ so that the predictor would follow shifts in the process mean. Equation (C-1) then becomes

$$\hat{Y}_{t+h} = \rho^hY_t + (1-\rho^h)\hat{Y}_t \qquad \text{(C-3)}$$

or substituting for $\hat{Y}_t$, it may be written

$$\hat{Y}_{t+h} = (\alpha + \beta\rho^h)Y_t + \beta(1-\rho^h)\hat{Y}_{t-1}. \qquad \text{(C-4)}$$



Inspection of these two equations reveals that as $\alpha \to 0$ the MSE of Equation (C-3) approaches that of the Wiener predictor, Equation (C-2), since Equation (C-4) collapses to Equation (C-1). Also note that when $\rho \ne 0$ the model Equation (C-3) is no longer an exponentially weighted moving average. It is instead a moving average in which the weights of past observations decrease exponentially, but in which the current observation receives a weight which is not a member of the same geometric series. The weights appearing in the expression for $\hat{Y}_t$ in Equation (B-3) sum to one, as do the coefficients of $Y_t$ and $\hat{Y}_t$ in Equation (C-3), thereby maintaining the desired average. Another observation concerning Equation (C-4) is that as the number of periods ahead for which we wish to forecast increases, the modified predictor tends to reduce to the simple EWMA form.






4. Parameter Estimation

The mean squared error of Equation (C-3) is given by

$$\text{M.S.E.} = \frac{2\sigma^2(1-\rho^h)(1 - \rho\beta^2 + \beta\rho^h - \beta\rho^{h+1})}{(1-\beta\rho)(1+\beta)} \qquad \text{(C-5)}$$

which reduces to the mean squared error of the Wiener predictor, Equation (C-2), when $\beta = 1$. The values selected for $\alpha$ and $\beta$ in Equation (C-4) will necessarily reflect a compromise between the desire for a minimum mean square error and a desire for protection against a change in mean process value. As Cox has shown [Ref. 5], however, when the optimal value of $\alpha$ is 0 there is insignificant change in mean squared forecast error for the EWMA when $\alpha$ is made larger, up to the range 0.1 or 0.15.
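
A small Python sketch can make the comparison between Equation (C-5) and the Wiener bound of Equation (C-2) concrete. The function names are hypothetical, and the values $\rho = 0.5$, h = 1 used in the loop are illustrative assumptions only.

```python
def cox_modified_mse(rho, beta, h=1, sigma2=1.0):
    """MSE of the modified predictor, Equation (C-5)."""
    numerator = (2.0 * sigma2 * (1.0 - rho ** h)
                 * (1.0 - rho * beta ** 2 + beta * rho ** h - beta * rho ** (h + 1)))
    return numerator / ((1.0 - beta * rho) * (1.0 + beta))


def wiener_mse(rho, h=1, sigma2=1.0):
    """MSE of the Wiener predictor, Equation (C-2)."""
    return (1.0 - rho ** (2 * h)) * sigma2


# Ratio of the modified-predictor MSE to the Wiener MSE for several alpha values.
for alpha in (0.0, 0.05, 0.1, 0.15, 0.3):
    print(alpha, cox_modified_mse(0.5, 1.0 - alpha) / wiener_mse(0.5))
```

The ratio equals one at $\alpha = 0$ and grows as $\alpha$ increases, which is the price paid for protection against a shift in the mean.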

D. THE HOLT-WINTERS AND THEIL-WAGE TWO-PARAMETER EXPONENTIALLY WEIGHTED MOVING AVERAGE (EWMA) LINEAR GROWTH FORECAST MODELS

Holt [Ref. 9] as further discussed in Winters [Ref. 10] 
proposed a forecasting model for time series exhibiting lin- 
ear growth which was a simple extension of the EWMA method 
already applied to time series with constant mean. Although 
Winters' development explained precisely how the method
worked, he has been criticized by Theil and Wage [Ref. 11] 
and others for failing to explicitly justify the method. 

The rationale for the criticism seems to be that Winters 
failed to formulate an explicit stochastic model as a basis 
for the forecasting method. Instead, Winters used a com- 
pletely empirical method for selection of parameters. It 






must be observed, however, that Winters utilized time series 
of actual data from three dissimilar processes. His ap- 
proach is probably representative of those currently used 
when real data must be forecast. For real data, when the 
model is in fact unknown, Winters' approach implicitly as- 
sumes various underlying models and his choice of parameters 
which gave the least forecast variance is his implicit spec- 
ification of the underlying model. For the purpose of this 
investigation the criticism raised by Theil and Wage is 
considered valid, and their generating model will be used 
along with a discussion of their determination of optimal 
parameters. It may be added that Harrison [Ref. 7] has postulated yet another generating model for the Holt-Winters predictor, one for which that predictor gives the least mean squared error, and has also derived optimal parameters.

1. The Linear Growth Series Generation Model

The linear growth model proposed by Theil and Wage [Ref. 11] is given by

$$Y_t = \bar{Y}_t + \varepsilon_t$$
$$\bar{Y}_t = \bar{Y}_{t-1} + b_t \qquad \text{(D-1)}$$
$$b_t = b_{t-1} + \delta_t$$

Here $\varepsilon_t$ is again a random shock or residual by which the observation $Y_t$ differs from its mean value $\bar{Y}_t$, and the change attributed to trend, the slope $b_t$, is changed by a random amount $\delta_t$. The random variables $\varepsilon_t$ and $\delta_t$ are assumed to have zero mean, constant variance and zero covariances.






2. The Forecast Models 



a. The Holt-Winters Model 



The Holt-Winters forecast model is given by

$$F_{t+k} = \hat{Y}_t + k\,\hat{b}_t \qquad \text{(D-2)}$$

where k = 1 for one-step-ahead forecasts. This is the original exponential smoothing model in which

$$\hat{Y}_t = \alpha Y_t + \beta(\hat{Y}_{t-1} + \hat{b}_{t-1}) \qquad \text{(D-3)}$$

and

$$\hat{b}_t = \alpha'(\hat{Y}_t - \hat{Y}_{t-1}) + \beta'\hat{b}_{t-1} \qquad \text{(D-4)}$$

with $\beta = 1-\alpha$ and $\beta' = 1-\alpha'$. The first parameter $\alpha$ in Equation (D-3) is the smoothing constant for the process level. Note that Equation (D-3) is identical to (B-3) except for the addition of the slope to the previous process level. The second parameter $\alpha'$ in Equation (D-4) is the smoothing constant for the exponentially smoothed slope estimate. Holt and Winters place no constraints on the relation between $\alpha$ and $\alpha'$, whereas Brown [Ref. 6] and Theil and Wage [Ref. 11] do.

b. The Theil-Wage Model

Theil and Wage have suggested that Equation (D-3) neglects available information and propose to replace it with

$$\hat{Y}_t = \alpha Y_t + \beta(\hat{Y}_{t-1} + \hat{b}_t) \qquad \text{(D-5)}$$



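which they consider a more simultaneous, rather than a recursive, approach. Taking Equation (D-4) and substituting into (D-5) they obtain

$$\hat{Y}_t = \alpha Y_t + \beta[\hat{Y}_{t-1} + \alpha'(\hat{Y}_t - \hat{Y}_{t-1}) + \beta'\hat{b}_{t-1}]$$
$$\quad\; = \beta\alpha'\hat{Y}_t + \alpha Y_t + \beta\beta'(\hat{Y}_{t-1} + \hat{b}_{t-1})$$

and therefore, solving for $\hat{Y}_t$,

$$\hat{Y}_t = \frac{\alpha}{1-\beta\alpha'}\,Y_t + \frac{\beta\beta'}{1-\beta\alpha'}\,[\hat{Y}_{t-1} + \hat{b}_{t-1}] \qquad \text{(D-6)}$$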



which is of the same general form as Equation (D-3), but the smoothing and discount parameters are changed to obtain all the information contained in the observations. Note that when $\alpha' = 0$ and $\beta' = 1$, Equation (D-6) becomes identical to Equation (D-3), because when $\alpha' = 0$ the slope is constant and therefore makes no contribution.
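
The two level updates can be compared side by side in code. The following Python sketch is a minimal illustration, assuming the update forms of Equations (D-3), (D-4) and (D-6) with $\beta = 1-\alpha$ and $\beta' = 1-\alpha'$; the function names and the numbers in the example call are hypothetical.

```python
def holt_winters_step(y, level, slope, alpha, alpha_p):
    """One Holt-Winters update: level by (D-3), then slope by (D-4)."""
    new_level = alpha * y + (1.0 - alpha) * (level + slope)
    new_slope = alpha_p * (new_level - level) + (1.0 - alpha_p) * slope
    return new_level, new_slope


def theil_wage_step(y, level, slope, alpha, alpha_p):
    """One Theil-Wage update: level by (D-6), then slope by (D-4)."""
    beta, beta_p = 1.0 - alpha, 1.0 - alpha_p
    denom = 1.0 - beta * alpha_p
    new_level = (alpha * y + beta * beta_p * (level + slope)) / denom
    new_slope = alpha_p * (new_level - level) + beta_p * slope
    return new_level, new_slope


# Previous level 10 and slope 2; the new observation 15 is above the forecast 12.
print(holt_winters_step(15.0, 10.0, 2.0, 0.3, 0.1))
print(theil_wage_step(15.0, 10.0, 2.0, 0.3, 0.1))
```

The Theil-Wage level moves slightly further toward the observation because the updated slope, rather than the previous one, is used in the level equation.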

3. Parameter Estimation Procedures

a. The Holt-Winters Model

The empirical parameter selection method used 
by Winters [Ref. 10] was a steepest ascent search for the 
optimal parameters. While this may be appropriate for 
critical forecasts, it is recalled that the objective of 
this comparison is finding a forecast technique suitable 
for routine forecasts of many items. Clearly, a search is 
inappropriate for each of them. Instead, use will be made 
of Winters' findings concerning the best composite parameter 
values obtained during his investigation. 

The evaluation method used [Ref. 10] to arrive at the best composite values was to express the forecast error standard deviation for each parameter combination as a percentage above the minimum standard deviation achieved for each product, and to add across the three products for each parameter combination. The best composite rating of 241 was achieved by parameter set (.2, .1) for ($\alpha$, $\alpha'$). These parameters will be used in this forecast comparison with the observation that they may not be the best parameters for the particular series used here. As noted above, however, for mass forecasts these parameters must suffice. The robustness or insensitivity of the model to various series should become apparent regardless of the parameters chosen.

Although neither Holt nor Winters gave a procedure for optimal parameter estimation, Harrison [Ref. 7] develops optimal parameters for a specific linear growth model

$$Y_t = \bar{Y}_t + \varepsilon_t$$
$$\bar{Y}_t = \bar{Y}_{t-1} + b_t + \gamma_t$$
$$b_t = b_{t-1} + \delta_t$$

for which Holt's predictor gives the least mean squared error. Following Harrison's procedure, the difference equation form of the Holt-Winters predictor given by

$$\hat{Y}_t = \hat{Y}_{t-1} + \hat{b}_{t-1} + \alpha e_t$$
$$\hat{b}_t = \hat{b}_{t-1} + \alpha' e_t$$

(in which $\alpha'$ denotes the gain applied directly to the forecast error in the slope update) may be written more compactly as






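$$e_{t+1} = ae_t + be_{t-1} + E_{t+1} \qquad \text{(D-7)}$$

where

$$a = 2 - \alpha - \alpha' \qquad \text{(D-8)}$$
$$b = -(1-\alpha) = -\beta. \qquad \text{(D-9)}$$

Expanding Equation (D-7) gives

$$e_{t+1} = \sum_{i=0}^{\infty}W_iE_{t+1-i} \qquad \text{(D-10)}$$

where

$$W_k = aW_{k-1} + bW_{k-2}, \qquad W_0 = 1, \quad W_{-1} = 0$$

and where

$$E_{t+1} = \varepsilon_{t+1} - 2\varepsilon_t + \varepsilon_{t-1} + \gamma_{t+1} - \gamma_t + \delta_{t+1}. \qquad \text{(D-11)}$$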



When $\lambda_1$ and $\lambda_2$ are roots of the equation

$$z^2 - az - b = 0$$

it is in general true that

$$W_{k-1} = \frac{\lambda_1^k - \lambda_2^k}{\lambda_1 - \lambda_2}.$$

From Equation (D-10) and Equation (D-11) it follows that the error variance of forecasts is given by

$$V(e) = \sum_{k=0}^{\infty}\left[(\nabla^2W_k)^2V(\varepsilon) + (\nabla W_k)^2V(\gamma) + W_k^2V(\delta)\right].$$

This error variance tends to a limit if $|W_k| < 1$, which implies that $|\lambda_1|$ and $|\lambda_2|$ are less than unity also. For stability of the expression it follows that

$$\left|\frac{a \pm (a^2 + 4b)^{1/2}}{2}\right| < 1.$$

When a substitution is made for a and b the stability conditions become

$0 < \alpha' <$ [upper bound illegible in source] (D-12)

$$2\alpha + \alpha' < 4. \qquad \text{(D-13)}$$
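
The stability requirement can be explored numerically by computing the roots of $z^2 - az - b = 0$ directly. The Python sketch below is only an illustration: the function names are hypothetical, and the inequality $0 < \alpha < 2$ used in `satisfies_inequalities` is an assumption taken from the standard stability test for a second-order difference equation, not a condition legible in the source.

```python
import cmath


def error_recursion_coefficients(alpha, alpha_p):
    """Coefficients a and b of the error recursion e[t+1] = a*e[t] + b*e[t-1] + E[t+1]."""
    return 2.0 - alpha - alpha_p, -(1.0 - alpha)


def roots_inside_unit_circle(alpha, alpha_p):
    """True when both roots of z^2 - a*z - b = 0 have modulus below one."""
    a, b = error_recursion_coefficients(alpha, alpha_p)
    disc = cmath.sqrt(a * a + 4.0 * b)
    return all(abs(root) < 1.0 for root in ((a + disc) / 2.0, (a - disc) / 2.0))


def satisfies_inequalities(alpha, alpha_p):
    """alpha' > 0 and 2*alpha + alpha' < 4, plus the assumed bound 0 < alpha < 2."""
    return alpha_p > 0.0 and 2.0 * alpha + alpha_p < 4.0 and 0.0 < alpha < 2.0


print(roots_inside_unit_circle(0.2, 0.1))   # Winters' composite parameter set
print(roots_inside_unit_circle(0.5, 3.5))   # violates 2*alpha + alpha' < 4
```

On a grid of parameter values away from the boundaries, the root test and the inequalities should classify each point the same way.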

The optimal parameters must satisfy these conditions, and since they will be determined as functions of the variances $V(\varepsilon)$, $V(\gamma)$ and $V(\delta)$ the parameters are further restricted by

$$0 < V(\varepsilon),\ V(\gamma),\ V(\delta) < V(\min).$$

This implies that the optimal parameters lie within the subset of the stability region defined by

$$0 < \alpha' < \frac{(\alpha + \alpha')^2}{2 + \alpha + \alpha'} < 1 \qquad \text{(D-14)}$$
$$0 < \alpha < 1$$



To determine the optimal parameters, Equation (D-7) can be multiplied by both $e_{t+1}$ and $e_t$, giving two equations which may be written

$$(1-b^2)S^2 = a(1+b)c_1 + b\,d_2 + d_0 \qquad \text{(D-15)}$$
$$aS^2 = (1-b)c_1 - d_1$$

where $d_j = E[e_{t-j}E_t]$, and $S^2$ and $c_1$ are the variance and first covariance of the errors $e_t$. Eliminating $S^2$ from the equations gives

$$(1+b)[(1-b)^2 - a^2]\,c_1 = (1-b^2)d_1 + ab\,d_2 + a\,d_0. \qquad \text{(D-16)}$$

It is shown in Appendix 1 to Ref. 7 that the constants a, b, and V satisfy the stability conditions when defined in such a way that

$$-bV = V(\varepsilon) = d_2$$
$$aV = (4-a)V(\varepsilon) + V(\gamma) = -d_1 \qquad \text{(D-17)}$$
$$V = (6 - 4a + a^2 + b)V(\varepsilon) + (2-a)V(\gamma) + V(\delta) = d_0$$
$$d_j = 0 \ \text{ for } |j| > 2.$$

Given these values, the right hand side of Equation (D-16) simplifies to

$$V[-a(1-b^2) - ab^2 + a] = 0. \qquad \text{(D-18)}$$

Substituting for a and b in the above, making use of the stability restrictions in Equation (D-14), it is seen that the coefficient of $c_1$ in Equation (D-16) is

$$\alpha\alpha'[4 - 2\alpha - \alpha'] > 0.$$

Since Equation (D-16) equals zero by Equation (D-18), it follows that $c_1 = 0$, and from Equation (D-15) and (D-17) it follows that $S^2 = V$. Since all other covariances can be shown to be zero by multiplying Equation (D-7) by $e_{t-i}$, $i = 1, \ldots, \infty$, it follows that the predictor is optimal when the parameters satisfy

$$V(\varepsilon) = (1-\alpha)V$$
$$V(\gamma) = (\alpha^2 + \alpha\alpha' - 2\alpha')V$$
$$V(\delta) = \alpha'^2V.$$
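
The optimality relations can be checked against the defining identities for a, b and V. The Python sketch below is a hedged illustration: the function names are hypothetical, $\alpha'$ is taken to be the gain applied directly to the forecast error (so that $a = 2 - \alpha - \alpha'$ and $b = -(1-\alpha)$), and the parameter values are arbitrary test points.

```python
def holt_optimal_variances(alpha, alpha_p, v=1.0):
    """Shock variances for which (alpha, alpha') are optimal, given error variance v."""
    var_eps = (1.0 - alpha) * v
    var_gamma = (alpha ** 2 + alpha * alpha_p - 2.0 * alpha_p) * v
    var_delta = alpha_p ** 2 * v
    return var_eps, var_gamma, var_delta


def d17_residuals(alpha, alpha_p, v=1.0):
    """Residuals of the three identities linking a, b, V to the shock variances."""
    a = 2.0 - alpha - alpha_p
    b = -(1.0 - alpha)
    ve, vg, vd = holt_optimal_variances(alpha, alpha_p, v)
    return (
        -b * v - ve,                       # V(eps) = (1 - alpha) V
        a * v - ((4.0 - a) * ve + vg),
        v - ((6.0 - 4.0 * a + a * a + b) * ve + (2.0 - a) * vg + vd),
    )


print(holt_optimal_variances(0.5, 0.1))
print(d17_residuals(0.5, 0.1))
```

All three residuals should vanish to rounding error, which shows that the three variance relations are consistent with the identities for a, b and V.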






Here V is the minimum variance of forecast error. The problem remains, however, of estimating the variances of the random components of the process. Harrison suggests the use of Serial Variation Curve analysis of the observed data to assess the suitability of the model for forecasting the data and to provide an estimate of the variances needed to select the optimal parameters. By examining the first differences of the data, Harrison [Ref. 7, p. 833] states that the Serial Variation Curve of the first differences is expected to be a straight line with gradient $V(\delta)$ for $i \ge 2$, or

$$E[\nabla Y_t - \nabla Y_{t-i}]^2 = 4V(\varepsilon) + 2V(\gamma) + iV(\delta), \quad i \ge 2$$
$$\qquad\qquad\qquad\; = 6V(\varepsilon) + 2V(\gamma) + V(\delta), \quad i = 1.$$

Harrison observes, however, that this method for determining the variances has a large error, and concludes here, as before in his derivation of parameters for the simple EWMA model, that the optimal parameters are again best determined by simulation (empirically) using the actual data. For purposes of this comparison, then, the parameter set recommended by Winters is still retained.

b. The Theil-Wage Model

To determine the adaptation parameters $\alpha$ and $\alpha'$ which minimize the mean square forecast error the forecast error must first be expressed as

$$e_t = \hat{Y}_{t-1} + \hat{b}_{t-1} - Y_t = \hat{Y}_{t-1} + \hat{b}_{t-1} - (\bar{Y}_{t-1} + b_{t-1} + \delta_t + \varepsilon_t)$$

where the value of $Y_t$ in Equation (D-1) is substituted into Equation (D-2). Note that the error is here taken as forecast minus observation. This may be written in more concise notation as

$$e_t = A_{t-1} + B_{t-1} - \varepsilon_t - \delta_t \qquad \text{(D-19)}$$

where

$$A_t = \hat{Y}_t - \bar{Y}_t, \qquad B_t = \hat{b}_t - b_t.$$

To insure that the notation is clear, recall that the forecast of $Y_t$ is $F_t = \hat{Y}_{t-1} + \hat{b}_{t-1}$ which can be considered as an estimator of $Y_t$ expressed as

$$Y_t = \bar{Y}_{t-1} + b_{t-1} + \varepsilon_t + \delta_t.$$

From Equation (D-19) it can be seen that the forecast error is the sum of three terms: the sampling error ($A_{t-1}$) of the mean process level at time t-1, the sampling error ($B_{t-1}$) of the slope at time t-1, and a disturbance combination associated with the t-th period. Using the forecast model Equation (D-3) and Equation (D-4), the sampling errors can be eliminated successively from Equation (D-19) and it can then be written as

$$\hat{Y}_t - \bar{Y}_t = (\hat{Y}_{t-1} - \bar{Y}_{t-1}) + (\hat{b}_{t-1} - b_{t-1}) - \delta_t - \alpha e_t.$$

Further rearrangement reveals that $A_t$ can be written as

$$A_t = (1-\alpha)(A_{t-1} + B_{t-1}) + \alpha\varepsilon_t - (1-\alpha)\delta_t. \qquad \text{(D-20)}$$

The same procedure permits $B_t$ to be written

$$B_t = -\alpha\alpha'A_{t-1} + (1-\alpha\alpha')B_{t-1} + \alpha\alpha'\varepsilon_t - (1-\alpha\alpha')\delta_t. \qquad \text{(D-21)}$$






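Finally these results may be expressed in vector form as

$$\begin{bmatrix}A_t\\ B_t\end{bmatrix} = P\begin{bmatrix}A_{t-1}\\ B_{t-1}\end{bmatrix} + Q\begin{bmatrix}\varepsilon_t\\ \delta_t\end{bmatrix} \qquad \text{(D-22)}$$

where

$$P = \begin{bmatrix}\beta & \beta\\ -\theta & \theta'\end{bmatrix}, \qquad Q = \begin{bmatrix}\alpha & -\beta\\ \theta & -\theta'\end{bmatrix}$$

$$\theta = \alpha\alpha'$$
$$\theta' = 1 - \theta$$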



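Analysis of the above reflects that an optimal $\alpha$ and $\theta$ imply an optimal $\alpha'$ also. By successive elimination of the vector

$$\begin{bmatrix}A_{t-1}\\ B_{t-1}\end{bmatrix}$$

in Equation (D-22) that equation then may be written

$$\begin{bmatrix}A_t\\ B_t\end{bmatrix} = P^n\begin{bmatrix}A_{t-n}\\ B_{t-n}\end{bmatrix} + \sum_{k=0}^{n-1}P^kQ\begin{bmatrix}\varepsilon_{t-k}\\ \delta_{t-k}\end{bmatrix}. \qquad \text{(D-23)}$$

If $\alpha$ and $\alpha'$ are positive (the latent roots of P lying within the unit circle), the first term on the right of Equation (D-23) converges to zero when $n \to \infty$. Combining this equation with Equation (D-19) the forecast error may be written

$$e_{t+1} = [1\ \ 1]\sum_{k=0}^{\infty}P^kQ\begin{bmatrix}\varepsilon_{t-k}\\ \delta_{t-k}\end{bmatrix} - \varepsilon_{t+1} - \delta_{t+1}. \qquad \text{(D-24)}$$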



On expanding and taking the expectation formally the mean square error is seen to be

$$\text{MSE} = [1\ \ 1]\,(QDQ' + PQDQ'P' + P^2QDQ'P'^2 + \cdots)\begin{bmatrix}1\\ 1\end{bmatrix} + \sigma_\varepsilon^2 + \sigma_\delta^2 \qquad \text{(D-25)}$$

where

$$D = \begin{bmatrix}\sigma_\varepsilon^2 & 0\\ 0 & \sigma_\delta^2\end{bmatrix}.$$

To simplify the MSE expression, the term in parentheses may be grouped as $S = QDQ' + PQDQ'P' + P^2QDQ'P'^2 + \cdots$ and $PSP' = PQDQ'P' + P^2QDQ'P'^2 + \cdots$ may be subtracted from it, giving

$$S - PSP' = QDQ'. \qquad \text{(D-26)}$$



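The result can then be regarded as a set of linear equations in the three elements $S_{11}$, $S_{12}$, $S_{22}$ of S, since it is seen that S is symmetric. Equation (D-26) can then be written in explicit form as follows

$$\begin{bmatrix}S_{11} & S_{12}\\ S_{12} & S_{22}\end{bmatrix} - \begin{bmatrix}\beta & \beta\\ -\theta & \theta'\end{bmatrix}\begin{bmatrix}S_{11} & S_{12}\\ S_{12} & S_{22}\end{bmatrix}\begin{bmatrix}\beta & -\theta\\ \beta & \theta'\end{bmatrix} = \begin{bmatrix}\alpha & -\beta\\ \theta & -\theta'\end{bmatrix}\begin{bmatrix}\sigma_\varepsilon^2 & 0\\ 0 & \sigma_\delta^2\end{bmatrix}\begin{bmatrix}\alpha & \theta\\ -\beta & -\theta'\end{bmatrix}$$

which after rearrangement, has the solution

$$\begin{bmatrix}S_{11}\\ S_{12}\\ S_{22}\end{bmatrix} = \frac{1}{\alpha\theta(4-2\alpha-\theta)}\begin{bmatrix}(2-\alpha\theta)\theta & 2\beta^2\theta & \beta^2(1+\beta)\\ -\beta\theta^2 & (2\alpha(1+\beta)-\theta)\theta & \beta(\alpha(1+\beta)-\theta)\\ (1+\beta)\theta^2 & 2(\alpha^2-2\alpha+\theta)\theta & \alpha^2(1+\beta)+2\theta\beta\end{bmatrix}\begin{bmatrix}\alpha^2\sigma_\varepsilon^2 + \beta^2\sigma_\delta^2\\ \alpha\theta\sigma_\varepsilon^2 + \beta\theta'\sigma_\delta^2\\ \theta^2\sigma_\varepsilon^2 + \theta'^2\sigma_\delta^2\end{bmatrix} \qquad \text{(D-27)}$$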






Returning to Equation (D-25) it is noted that S must be premultiplied by [1 1] and postmultiplied by [1 1]'. This suggests the form $S_{11} + 2S_{12} + S_{22}$, so Equation (D-27) is premultiplied by [1 2 1]. The product of this vector with the 3x3 matrix on the right side of Equation (D-27) results in $[2\theta \ \ 2\theta \ \ 1+\beta]$.
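
The matrix argument can be verified numerically by iterating $S = QDQ' + PSP'$ to convergence and comparing the resulting mean square error with the closed form that this section derives as Equation (D-28). The Python sketch below is an illustrative check only; the function names, the iteration count and the parameter values are assumptions.

```python
def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]


def steady_state_s(alpha, theta, var_eps, var_delta, n_iter=500):
    """Iterate S = QDQ' + P S P' until it has effectively converged."""
    beta, theta_p = 1.0 - alpha, 1.0 - theta
    p = [[beta, beta], [-theta, theta_p]]
    q = [[alpha, -beta], [theta, -theta_p]]
    d = [[var_eps, 0.0], [0.0, var_delta]]
    r = matmul(matmul(q, d), transpose(q))
    s = r
    for _ in range(n_iter):
        psp = matmul(matmul(p, s), transpose(p))
        s = [[r[i][j] + psp[i][j] for j in range(2)] for i in range(2)]
    return s


def mse_from_s(s, var_eps, var_delta):
    """[1 1] S [1 1]' plus the two disturbance variances."""
    return s[0][0] + 2.0 * s[0][1] + s[1][1] + var_eps + var_delta


def mse_closed_form(alpha, theta, var_eps, var_delta):
    """Closed-form mean square error in terms of alpha and theta."""
    beta = 1.0 - alpha
    denom = alpha * (2.0 * (1.0 + beta) - theta)
    return (var_eps * (4.0 * alpha + 2.0 * theta) / denom
            + var_delta * (1.0 + beta) / (theta * denom))


s = steady_state_s(0.5, 0.2, 25.0 / 3.0, 4.0 / 3.0)
print(mse_from_s(s, 25.0 / 3.0, 4.0 / 3.0), mse_closed_form(0.5, 0.2, 25.0 / 3.0, 4.0 / 3.0))
```

The iteration converges only when the latent roots of P lie inside the unit circle, which holds for the parameter values used here.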

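Combining this with Equation (D-25) the mean square prediction error is found to be

$$\text{MSE} = S_{11} + 2S_{12} + S_{22} + \sigma_\varepsilon^2 + \sigma_\delta^2$$
$$\quad = \sigma_\varepsilon^2\,\frac{4\alpha + 2\theta}{\alpha(2(1+\beta) - \theta)} + \sigma_\delta^2\,\frac{1+\beta}{\alpha\theta(2(1+\beta) - \theta)}$$
$$\quad = \sigma_\varepsilon^2\,\frac{g^2(1+\beta) + 4\alpha\theta + 2\theta^2}{\alpha\theta(2(1+\beta) - \theta)} \qquad \text{(D-28)}$$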

where

$$g^2 = \sigma_\delta^2/\sigma_\varepsilon^2 \qquad \text{(D-29)}$$

the ratio of the slope change variance to the additive error term variance. The nature of the random processes normally encountered suggests that g is often a small number. Now that the MSE has been obtained in fairly simple form, the optimal value of $\theta$ may be found by taking the derivative of Equation (D-28) with respect to $\theta$ and equating the numerator to zero. Solving the resulting quadratic equation in $\theta$, the optimal $\theta$ value may be written

$$\theta = h^2(1+\beta) \qquad \text{(D-30)}$$

where

$$h^2 = -\tfrac{1}{8}g^2 + \tfrac{1}{2}g\left(1 + \tfrac{1}{16}g^2\right)^{1/2}. \qquad \text{(D-31)}$$

Equation (D-30) gives the value of $\theta$ which will minimize the mean square forecast error, given a specific $\alpha$. Recalling that $\theta = \alpha\alpha'$ and that $1+\beta = 2-\alpha$, it is seen that the optimal $\theta$ is a decreasing linear function of $\alpha$. As a matter of convenience it has been found easier to work with h than with g, so solving Equation (D-31) for g results in

$$g^2 = 4h^4/(1-h^2). \qquad \text{(D-32)}$$

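Substitution of the $\theta$ value of Equation (D-30) into Equation (D-28) gives

$$\min_{\theta}\text{MSE} = \sigma_\varepsilon^2\,\frac{g^2 + 4h^4 + 2h^2(2-h^2)\alpha}{h^2(2-h^2)\alpha(1+\beta)} = \sigma_\varepsilon^2\,\frac{4h^2/(1-h^2) + 2\alpha}{\alpha(1+\beta)}. \qquad \text{(D-33)}$$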



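Differentiation of Equation (D-33) with respect to $\alpha$ and putting the result equal to zero gives a quadratic equation in $\alpha$. The results are found to be

$$\alpha = \frac{2h}{1+h}, \quad \beta = \frac{1-h}{1+h}, \quad \alpha' = h, \quad \theta = \frac{2h^2}{1+h}, \quad \beta' = 1-h. \qquad \text{(D-34)}$$

A substitution of the results of Equation (D-34) into Equation (D-33) gives

$$\min_{\alpha,\theta}\text{MSE} = \sigma^2 = \sigma_\varepsilon^2/\beta. \qquad \text{(D-35)}$$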

a , 0 



This result has been tabulated in Reference 11 for various values of $g^2$. For an illustration of the method, however, the optimal parameters for the Theil-Wage forecast model will be computed here, assuming that $\sigma_\varepsilon^2$ and $\sigma_\delta^2$ of Equation (D-1) are 25/3 and 4/3 respectively.

$$\varepsilon_t \sim U(-5, 5) \Rightarrow \sigma_\varepsilon^2 = (5+5)^2/12 = 25/3$$
$$\delta_t \sim U(-2, 2) \Rightarrow \sigma_\delta^2 = (2+2)^2/12 = 4/3$$

from which

$$g^2 = \sigma_\delta^2/\sigma_\varepsilon^2 = .16$$

and

$$g = 0.4$$

from Equation (D-29). From Equation (D-31) it follows that

$$h^2 = -\tfrac{1}{8}(16/100) + \tfrac{1}{2}(4/10)(1 + .16/16)^{.5} = 0.181$$
$$h = .425 = \alpha'$$
$$\alpha = 2(.425)/1.425 = .850/1.425 = 0.596.$$

These values of $\alpha$ and $\alpha'$ are those used in the computer comparison of the Theil-Wage model. These parameters will insure minimum mean square forecast error for series generated in accordance with Equation (D-1) where the range on $\varepsilon$ is 10 and the range on $\delta$ is 4. Using the results of Equation (D-35) it is noted that the min. M.S.E. should be approximately 20.6.
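
The worked example can be reproduced with a short Python sketch of Equations (D-29), (D-31), (D-34) and (D-35). The function name and the dictionary keys are hypothetical; only the two variances come from the text.

```python
import math


def theil_wage_optimal(var_eps, var_delta):
    """Optimal Theil-Wage parameters from the two shock variances."""
    g2 = var_delta / var_eps                                      # (D-29)
    g = math.sqrt(g2)
    h2 = -g2 / 8.0 + 0.5 * g * math.sqrt(1.0 + g2 / 16.0)         # (D-31)
    h = math.sqrt(h2)
    alpha = 2.0 * h / (1.0 + h)                                   # (D-34)
    beta = (1.0 - h) / (1.0 + h)
    return {
        "h": h,
        "alpha": alpha,
        "alpha_prime": h,
        "beta": beta,
        "theta": alpha * h,
        "min_mse": var_eps / beta,                                # (D-35)
    }


params = theil_wage_optimal(25.0 / 3.0, 4.0 / 3.0)
for name, value in params.items():
    print(name, round(value, 3))
```

Carrying more decimal places than the hand calculation changes the parameters only in the third decimal place, so the values agree with those quoted above to the precision shown there.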

E. BROWN'S ONE-PARAMETER EWMA LINEAR GROWTH MODEL 
1. The Series Generation Model

The series generation model for Brown's forecast rule is simply the forecast rule with random numbers applied as the shocks to process level and slope. Comparison with Equation (E-5) will reveal that the forms are equivalent.

The model may be written

$$Y_t = \bar{Y}_t + \varepsilon_t$$
$$\bar{Y}_t = \bar{Y}_{t-1} + b_t + \varepsilon'_t \qquad \text{(E-1)}$$
$$b_t = b_{t-1} + \varepsilon''_t$$

where $Y_t$ is the observation at time t, $\bar{Y}_t$ is the process level at time t, $b_t$ is the process slope at time t and $\varepsilon_t$ is the random shock observed as a forecast error at time t. $\varepsilon'_t$ is a random shock experienced by the process level, and Brown's forecast rule writes this shock in terms of the forecast error, $\varepsilon'_t = (1-\beta^2)\varepsilon_t$. $\varepsilon''_t$ is the random change in process slope which Brown's model assumes may be written $\varepsilon''_t = \alpha^2\varepsilon_t$. This model is essentially the linear growth model in Equation (D-1) with Brown's assumptions concerning the relationship of the various random elements.

2. The Forecast Model

Brown's forecast model, as discussed in Reference 7, is essentially a special case of the Holt-Winters model (Equation (D-3) and Equation (D-4)) in which he restricted the second parameter to be a function of the first. The forecast rule is identical to Equation (D-2)

$$F_{t+k} = \hat{Y}_t + k\,\hat{b}_t. \qquad \text{(E-2)}$$

To illustrate the similarity of Brown's model to the Holt-Winters model, the latter model may be written in the random shock form using the forecast error as the shock

$$e_t = Y_t - (\hat{Y}_{t-1} + \hat{b}_{t-1}).$$

Equation (D-3) becomes

$$\hat{Y}_t = \hat{Y}_{t-1} + \hat{b}_{t-1} + \alpha e_t \qquad \text{(E-3)}$$

and Equation (D-4) becomes

$$\hat{b}_t = \hat{b}_{t-1} + \alpha\alpha'e_t. \qquad \text{(E-4)}$$






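When Brown's double smoothing model is written in the same form [Ref. 7], the similarities are easily seen as

$$\hat{Y}_t = \hat{Y}_{t-1} + \hat{b}_{t-1} + (1-\beta^2)e_t \qquad \text{(E-5)}$$
$$\hat{b}_t = \hat{b}_{t-1} + (1-\beta)^2e_t.$$

The Holt-Winters parameter $\alpha$, which smooths the process level in Equation (E-3), is equivalent to Brown's parameter $(1-\beta^2)$ or $\alpha(2-\alpha)$. The Holt-Winters parameter $\alpha'$, which enters Equation (E-4) through the product $\alpha\alpha'$, is equivalent to Brown's $\alpha/(2-\alpha)$. When these equivalent parameters are numerically equal, then Brown's model will yield the same result as the Holt-Winters model. The forecast model used in the computer program for this comparison, however, is the more familiar form of Brown's double smoothing model, written in terms of the single and double smoothed values $\tilde{Y}_t$ and $\tilde{Y}_t^{(2)}$:

$$\hat{Y}_t = 2\tilde{Y}_t - \tilde{Y}_t^{(2)} \qquad \text{(E-6)}$$
$$\hat{b}_t = \frac{\alpha}{\beta}\,[\tilde{Y}_t - \tilde{Y}_t^{(2)}] \qquad \text{(E-7)}$$
$$\tilde{Y}_t = \alpha Y_t + \beta\tilde{Y}_{t-1} \qquad \text{(E-8)}$$
$$\tilde{Y}_t^{(2)} = \alpha\tilde{Y}_t + \beta\tilde{Y}_{t-1}^{(2)}. \qquad \text{(E-9)}$$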

Equation (E-6) expresses the estimate of the lag-corrected current process level, and Equation (E-7) expresses the current estimate of the process slope. Equations (E-8) and (E-9) are the single and double smoothed expressions used to calculate the level and slope. The method involved in these calculations makes use of the known fact that single smoothing (the simple exponentially weighted moving average of Equation (E-8)) lags a trend by a constant amount. The double smoothed version in Equation (E-9) lags the single smoothed value by the same amount by which single smoothing lags the observed trend. If Equation (E-6) is rewritten as $\tilde{Y}_t + (\tilde{Y}_t - \tilde{Y}_t^{(2)})$ the term in parentheses is recognized to be the lag correction required to give the current adjusted estimate of process level. The amount of lag inherent in Equation (E-8) for trend series is the observed result of using past data to estimate the current level. This lag corresponds to the "average age" of the past data used in making the estimate, and the error due to this lag can be shown to be

$$e_t = \frac{\beta}{\alpha}\,b_t.$$

Solving this relation for the slope yields Equation (E-7), where $e_t = \tilde{Y}_t - \tilde{Y}_t^{(2)}$. Brown extends this approach further to form what he terms third order smoothing, and higher levels, but these forms are not relevant to the present comparison.

3. Parameter Estimation

Brown advocates an empirical approach in "fitting" the forecast to the series, but he also says that $\alpha = 0.1$ is a good multi-purpose value. Since the Holt-Winters parameter values were taken from the literature as their best estimate of a good value combination, Brown's value will be taken as such also. While this value may be improved upon for specific series by experimentation, this superficial selection is in keeping with the objective of this study: to forecast varied types of series at minimum cost.
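
Brown's double smoothing rule, Equations (E-6) through (E-9) with the one-step forecast of Equation (E-2), can be sketched as follows. This is a minimal Python illustration rather than the thesis program; the function name is hypothetical, and starting both smoothed values at the first observation is an assumption made here for simplicity.

```python
def brown_double_smoothing(series, alpha=0.1):
    """Return one-step-ahead forecasts F[t+1] made after each observation Y[t]."""
    beta = 1.0 - alpha
    s1 = s2 = series[0]  # assumed start-up: both smoothed values set to Y[0]
    forecasts = []
    for y in series:
        s1 = alpha * y + beta * s1            # single smoothing, (E-8)
        s2 = alpha * s1 + beta * s2           # double smoothing, (E-9)
        level = 2.0 * s1 - s2                 # lag-corrected level, (E-6)
        slope = (alpha / beta) * (s1 - s2)    # slope estimate, (E-7)
        forecasts.append(level + slope)       # forecast with k = 1, (E-2)
    return forecasts


# Noise-free linear trend: the forecasts should lock onto the line.
trend = [5.0 + 0.5 * t for t in range(600)]
forecasts = brown_double_smoothing(trend, alpha=0.1)
print(forecasts[0], trend[1])
print(forecasts[-2], trend[-1])
```

On a noise-free trend the early forecasts lag because of the crude start-up values, but the lag correction removes the error once the start-up transient has died out.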






F. THE BOX-JENKINS POLYNOMIAL PREDICTOR - THE GENERAL MODEL OF ORDER N

1. The General Polynomial Growth Model

The generating model for which the Box-Jenkins polynomial predictor is the optimal linear least square predictor is discussed by Harrison in Reference 7. The general polynomial model specifies that each "derivative" of the underlying process experiences a random change in such a way that for the n-th order polynomial the model is given by:

$$Y_t = Y_t^{(1)} + \varepsilon_t \qquad \text{(F-1)}$$
$$Y_t^{(i)} = Y_{t-1}^{(i)} + Y_t^{(i+1)} + \gamma_t^{(i)}, \quad i = 1, \ldots, n. \qquad \text{(F-2)}$$

Here $Y_t^{(n+1)} = 0$ and the random errors $\varepsilon_t$ and $\gamma_t^{(i)}$ have mean zero and variance $\sigma^2I$. Comparison of Equation (F-1) and Equation (F-2) with the steady model, Equation (B-4), reveals that the latter equation results from i=1 in Equations (F-1) and (F-2). A similar comparison of the linear growth model, Equation (D-1), reflects that this form results from setting i=1,2 in Equation (F-2). The particular model in Equation (D-1) has $E[\gamma_t^{(1)}]$ and $V(\gamma_t^{(1)}) = 0$, but the variance could easily have had a positive value and nothing in that analysis would have changed except the complexity of terms in the development.



2. Optimal Properties of the Polynomial Predictor

For the assumed generating model the Box-Jenkins polynomial predictor can be shown to be optimal in the linear least squares sense and the optimal values of the forecasting parameters $\alpha_i$ can be derived as functions of the variances of the random shocks $\varepsilon_t$ and $\gamma_t^{(i)}$. A special case of the Box-Jenkins result derived on pp. 312, 313 of Reference 14 will be employed and the procedure followed by Harrison on p. 824 of Reference 7 will be used to show the result.

Let a stochastic process be generated by the model

    Y_{t+1} = \sum_{j=0}^{\infty} \pi_j Y_{t-j} + \epsilon_{t+1}                     (F-3)

where the {\epsilon_t} have zero mean and are identically distributed
uncorrelated random variables, and the \pi_j are constants.
Consider a forecast rule of the form

    \hat{Y}_{t+1} = \sum_{j=0}^{\infty} \gamma_j Y_{t-j}                             (F-4)

where the \gamma_j are constants. Since it is true that

    E[e_{t+1}^2] = E[(Y_{t+1} - \hat{Y}_{t+1})^2] \geq E[\epsilon_{t+1}^2]           (F-5)

the forecast rule will be optimal in the least squares sense
when \gamma_j = \pi_j. When this is true, Equation (F-5) implies that
the forecast errors {e_t} are identical to the random shocks
{\epsilon_t}, and therefore are also uncorrelated. By successive
substitution, the Equation (F-4) form of the forecast rule
may be transformed in terms of the forecast errors as



    \hat{Y}_{t+1} = \hat{Y}_t + \sum_{j=0}^{\infty} w_j e_{t-j}                      (F-6)

where the w_j are constants. Writing Equation (F-6) in dif-
ference equation form

    \nabla \hat{Y}_{t+1} = \sum_{j=0}^{\infty} w_j e_{t-j}                           (F-7)



it follows from the previous analysis that this forecast rule
is optimal for the equivalent underlying stochastic process

    \nabla Y_{t+1} = \sum_{j=0}^{\infty} w_j \epsilon_{t-j} + \nabla \epsilon_{t+1}

since the forecast errors were equivalent to the random
shocks. Specifically, the forecast rule is optimal for a
series generated by the model

    \nabla Y_{t+1} = \sum_{j=0}^{n-1} \gamma_j S^j \epsilon_t + \nabla \epsilon_{t+1}            (F-8)

where S^j is used to denote the j-th multiple sum. Differenc-
ing Equation (F-8) n-1 times, Harrison arrives at the expres-
sion

    \nabla^n Y_{t+1} = \sum_{j=0}^{n-1} \gamma_j \nabla^{n-j-1} \epsilon_t + \nabla^n \epsilon_{t+1}     (F-9)

which has no error terms beyond \epsilon_{t-n+1}. The result which
has been found states that if a random observation Y_t has
the property that its n-th difference can be represented as
a moving average process of iid variables {\epsilon_t} which have
zero means and the process is of order n+1, then the Box-
Jenkins n-th order polynomial predictor given by

    \hat{Y}_{t+1} = \hat{Y}_t + \sum_{j=0}^{n-1} \gamma_j S^j e_t                    (F-10)

is optimal and the {e_t} represent the one-step-ahead fore-
cast errors for this optimal rule.
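The operation of a predictor of the Equation (F-10) type, in which each new forecast is the previous forecast plus a weighted combination of the current error and its multiple sums, might be sketched as follows. This is an illustration under assumptions rather than the thesis program: the name `polynomial_predictor`, the list `coeffs` of weights on the successive multiple sums, and the zero starting values are all hypothetical.

```python
def polynomial_predictor(series, coeffs, initial_forecast=0.0):
    """One-step-ahead forecasts of the form
    forecast_{t+1} = forecast_t + sum_j coeffs[j] * S^j e_t,
    where S^0 e_t = e_t and S^j e_t = S^j e_{t-1} + S^{j-1} e_t."""
    sums = [0.0] * len(coeffs)      # sums[j] holds the j-th multiple sum
    forecast = initial_forecast
    forecasts, errors = [], []
    for y in series:
        forecasts.append(forecast)
        e = y - forecast            # one-step-ahead forecast error
        errors.append(e)
        sums[0] = e
        for j in range(1, len(coeffs)):
            sums[j] += sums[j - 1]  # accumulate the next higher sum
        forecast += sum(c * s for c, s in zip(coeffs, sums))
    return forecasts, errors

# Usage: a second order predictor applied to an exactly linear series.
forecasts, errors = polynomial_predictor([1.0, 2.0, 3.0, 4.0], [1.0, 1.0])
```

In this usage the only non-zero error is the first one, caused by the arbitrary starting values; thereafter the second order rule tracks the straight line exactly.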






3. Parameter Estimation 



The procedure used by Harrison [Ref. 7] shown in 
section B-5 for the simple EWMA model demonstrates the 
Box-Jenkins parameter estimation procedure for the steady 
model. The optimal predictor for the linear growth model 
is the form of the general predictor recommended by Holt- 
Winters. Harrison has derived [Ref. 7] the optimal parame- 
ters for this model as previously described in Section D.3.a. 

4. The Relationship of Special Cases of the General
Polynomial Predictor

The forecast models proposed by Holt-Winters, Theil-
Wage, and Brown are specific forms of the Box-Jenkins Poly-
nomial Predictor. To demonstrate the relationship, Equation
(F-10) can be written in the form



    \hat{Y}_{t+1} = \sum_{i=1}^{n} \hat{Y}_t^{(i)}                                   (F-11)

where

    \hat{Y}_t^{(i)} = \sum_{j=i}^{n} \hat{Y}_{t-1}^{(j)} + \alpha_i e_t              (F-12)

and

    e_t = Y_t - \hat{Y}_t .

This form resulted from expressing the Y in terms of the
errors and substituting in Equation (F-11).

a. The Simple Exponentially Weighted Moving Average

The first order predictor, where n=1 in Equation
(F-11), causes that expression to reduce to the simple EWMA
model

    \hat{Y}_{t+1} = \hat{Y}_t

    \hat{Y}_t = \hat{Y}_{t-1} + \alpha e_t

where equivalence to Equation (B-3) is seen by substitution
for the error in terms of the observation and forecast,

    \hat{Y}_{t+1} = \hat{Y}_t = \alpha Y_t + (1 - \alpha) \hat{Y}_{t-1} .

b. The Holt-Winters Linear Growth Model

When n=2 in Equation (F-11), the second order
predictor becomes

    \hat{Y}_{t+1} = \hat{Y}_t + b_t

    \hat{Y}_t = \hat{Y}_{t-1} + b_{t-1} + \alpha e_t

    b_t = b_{t-1} + \alpha' e_t



— ( 2 ) 

t.rl-v -r- 0 V v J ~ c* T.ryi f f a e K 



c. The Brown Linear Growth Model

Brown's second order predictor is a particular
form of Holt's, where the two parameters \alpha, \alpha' are restricted
to be functions of the discounting parameter \beta,

    \alpha = 1 - \beta^2

    \alpha' = (1 - \beta)^2 .
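The second order predictor in its error-correction form, together with Brown's restriction of the two parameters to functions of a single discounting parameter, can be sketched as below. The function names, the argument names and the starting values are assumptions introduced for illustration and are not taken from the thesis program.

```python
def brown_parameters(beta):
    """Brown's restriction: both smoothing parameters from one discount factor."""
    alpha = 1.0 - beta ** 2
    alpha_prime = (1.0 - beta) ** 2
    return alpha, alpha_prime

def linear_growth_forecasts(series, alpha, alpha_prime, level=0.0, slope=0.0):
    """Error-correction form of the second order (linear growth) predictor."""
    forecasts = []
    for y in series:
        forecast = level + slope             # forecast of the coming observation
        forecasts.append(forecast)
        e = y - forecast                     # one-step-ahead error
        level = level + slope + alpha * e    # level update
        slope = slope + alpha_prime * e      # slope update
    return forecasts

# Usage: Brown's model with a discounting parameter of 0.5 (arbitrary value).
a, a_prime = brown_parameters(0.5)
forecasts = linear_growth_forecasts([3.0, 5.0, 7.0], a, a_prime, level=1.0, slope=2.0)
```

Any other pair of parameters may be passed to `linear_growth_forecasts`, so the same routine serves for the unrestricted Holt-Winters form.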



d. The Theil-Wage Linear Growth Model

The Theil-Wage Predictor restricts the Holt-
Winters Parameters so that

    \alpha = \frac{2h}{1 + h} ,

    \alpha' = h \alpha .






III. RELATIVE FORECAST MODEL PERFORMANCE COMPARISON



A. COMPARISON METHODOLOGY 

1. Time Series Generating Model Specification

When forecasts of a particular time series are at- 
tempted, the forecaster's belief concerning the underlying 
generating process governs his selection of a specific fore- 
cast model. Although recognizing that few models completely 
describe the complexity of practical economic or physical 
processes, by his selection of a forecast model the fore- 
caster thereby indicates that the time series has an under- 
lying stochastic process whose functional form is suggested 
by the forecast rule. The adaptive parameters chosen for 



. y ■c-liC d \ 



JU Wi. C- i l \ 



'ilv X CL 



ting process. The degree to which the actual underlying 
process differs from the assumed form is reflected in the 
forecast accuracy obtained by the forecast model. 

2. Forecast Model Specification Error Measurement

Any comparison of relative forecast accuracy is
fundamentally an attempt to determine which of the forecast
models is the more accurate specification of the process
generating the series being forecast. The measure of speci-
fication error used in this comparison will employ the method
discussed by Bossons [Ref. 13], who defined specification
error as the additional variance of forecast errors introduced
by misspecification of the generating process. This measure
cannot be determined when comparison is attempted using actual






time series data, since the precise underlying model is un-
known. Since the various series examined in this compari-
son have been generated from known process models for which
the optimal forecast model and parameters are also known,
it is possible to determine

    f = \frac{VAR(Y_t - Y_t^{**})}{VAR(Y_t - Y_t^{*})} - 1                           (A-1)

as a measure of the model specification error. If {Y_t} is the
series being forecast, then {Y_t**} is the series of forecasts
generated by the forecast model whose associated specifica-
tion error is being measured, and {Y_t*} is a series of optimal
(minimum asymptotic variance) forecasts for the series {Y_t}.
The lower bound on f is zero, and is obtained when {Y_t**} =
{Y_t*}. No upper bound on f exists. Positive f values reflect
the extent to which a particular forecast series has been
degraded by misspecification of the underlying stochastic
process. This measure, as Bossons [Ref. 13] has observed,
has two uses. First, it permits the effect of a known mis-
specification, such as a model simplification, to be measured.
This emphasizes the relative importance of various coeffi-
cients or parameters in the model, and its sensitivity to
change. Second, it permits the robustness of a forecast
model to be measured by reflecting the forecast accuracy
for various types of series misspecifications, such as cor-
relation, linearity, or for specific distributions of the
random variables in the process.
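The specification error measure can be estimated from three aligned lists: the observations, the forecasts of the model under test, and the forecasts of the optimal model. The sketch below assumes sample variances are used as estimates of the two variances; the function and argument names are hypothetical.

```python
from statistics import variance

def specification_error(actual, candidate, optimal):
    """Sample estimate of f: the ratio of the candidate model's forecast
    error variance to the optimal model's, minus one."""
    candidate_errors = [y - f for y, f in zip(actual, candidate)]
    optimal_errors = [y - f for y, f in zip(actual, optimal)]
    return variance(candidate_errors) / variance(optimal_errors) - 1.0

# Usage with made-up numbers: the candidate's errors are twice as large
# as the optimal model's, so its error variance is four times as large.
actual = [1.0, 2.0, 3.0, 4.0]
optimal = [0.0, 3.0, 2.0, 5.0]
candidate = [-1.0, 4.0, 1.0, 6.0]
f = specification_error(actual, candidate, optimal)
```

Because both variances are sample estimates, a computed f may fall slightly below zero for a model that is nearly optimal, even though the theoretical lower bound is zero.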






3. Forecast Model Performance Criterion 



The measure of model specification error permits 
a preference ordering or ranking of forecast models to be 
made for any given time series generation model. It is 
often true that a collection of these series must be fore- 
cast, and few of the series are represented by any one 
generating model. It would, of course, be possible to anal- 
yze all the series and group them according to these process 
relationships, and for some processes this expense is justi- 
fied. For many series however, as mentioned in the introduc- 
tion to this study, it is not critical that the forecasts be 
exceedingly accurate, and the forecast costs must be kept to 
a minimum, consistent with the benefits obtained from such 
forecasts. An additional method of measurement, therefore,

is needed to permit a preferred ranking of forecast models 
over a collection of series to be forecast. It may be true 
that one or two forecast models may provide the necessary 
accuracy over the entire collection of series. However, it 
is recognized that this is extremely situation dependent. 

The method used in this study to suggest possible "best" 
forecast models for use in predicting a collection of dis- 
similar time series is the calculation of the average speci- 
fication error over all series forecast, and the sample 
variance of the specification error. Selection of a model 
based on the estimated mean specification error would, of 
course, imply that a few forecasts which were grossly erro- 
neous could be tolerated, while selection based upon esti- 
mated specification error variance would suggest that a 






uniformly high level of accuracy was not required but that 
no extremely poor forecasts were acceptable. Some combina- 
tion of the two criteria may even be considered. The con- 
clusions of this study will be restricted to rankings in 
terms of the mean and variance of specification error. No 
valid general interpretation can be given to these conclu- 
sions, since they are applicable only for the class of 
models and distributions chosen for use in this study. 
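The two group criteria, the average specification error over all series and its sample variance, lead to two possibly different rankings of the same models. A minimal sketch follows; the model labels and the f values are invented for illustration and are not results from this study.

```python
from statistics import mean, variance

def rank_models(spec_errors):
    """spec_errors maps a model name to its list of f values, one per series.
    Returns the (mean, variance) summary and the two rankings, best first."""
    summary = {m: (mean(f), variance(f)) for m, f in spec_errors.items()}
    by_mean = sorted(summary, key=lambda m: summary[m][0])
    by_variance = sorted(summary, key=lambda m: summary[m][1])
    return summary, by_mean, by_variance

# Hypothetical values: model A is usually excellent but fails badly once,
# model B is never excellent but never fails.
summary, by_mean, by_variance = rank_models({
    "A": [0.0, 0.0, 2.4],
    "B": [0.9, 1.0, 1.1],
})
```

Here model A is preferred on the mean criterion and model B on the variance criterion, which is the distinction drawn in the discussion above.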

The method would, however, be applicable to the 
relative comparison of a group of forecast models when 
forecasting actual time series, if one were willing to 
assume that the particular model with the least mean square
forecast error was "optimal" for a particular series. Even
with specification error based only upon the "best" model,
rather than upon a truly optimal (minimum variance error) 
model, it would still be possible to draw meaningful con- 
clusions concerning the relative effectiveness of a group 
of models. The same approach would apply when searching 
for a general purpose parameter for use in a single model 
which must be applied to a collection of series. It is un- 
likely that an optimal parameter for one series will be op- 
timal for all. Use of this method on a representative sample 
of series would facilitate selection of the parameter which
minimizes the average forecast error variance or some other 
selected criterion for the entire collection of series, and 
not just for one particular series. 






4. Time Series Data Generation Methods



a. Parameter Selection 

For each of the forecast models discussed in 
Section II a time series was generated using the underlying 
process generation model associated with that specific fore- 
cast rule. The random shock forms of those rules only re- 
quire that the parameters, such as intercept and slope, be 
specified and that the random shock be added to provide the 
stochastic element for the series. It was determined during
the course of the study that the variance of the random ele- 
ment caused little or no change in the specification error, 
so no attempt was made to produce results over a range of 
model parameters. The reason for this was that, even though 

the forecast error variances were changed, they

changed in generally the same proportion for all and the ratios 
remained approximately constant. In no case would the rank- 
ing have changed due to the parameters selected. Parameters 
used to obtain the representative results contained in Tables 
I and II may be determined by consulting the computer program. 

b. The Random Number Generator 

The random number generator upon which all sto- 
chastic series properties are based is a function called 
URN which is contained in the Naval Postgraduate School IBM 
360-67 computer. This additive generator was selected as a 
standard instead of available multiplicative generators or 
the conventional RANDU since it was almost three times faster 
than RANDU and had been subjected to statistical tests which 






were on file at the computer facility at USNPGS. As a 
known quantity, questions involving this generator and its 
effect on results should be quickly resolved without ad- 
ditional statistical testing of the generator. 

c. Normal Random Number Generation 

The normal random numbers used with the Least- 
Squares Models were generated from the uniform (0,1) URN 
output by using the Central Limit approach outlined by 
Naylor [Ref. 8] on pages 92-93. This amounted to summing 
twelve uniform (0,1) numbers and subtracting six to produce 
a normal (0,1) number. 
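The Central Limit approach described above may be sketched as follows. Python's standard `random` module stands in here for the URN generator, which is an assumption of the sketch; the function name `clt_normal` is hypothetical.

```python
import random

def clt_normal(rng):
    """Approximate normal (0,1) deviate: sum twelve uniform (0,1) numbers
    and subtract six. The sum has mean 6 and variance 12 * (1/12) = 1."""
    return sum(rng.random() for _ in range(12)) - 6.0

# Usage: draw a sample with a seeded generator for repeatability.
rng = random.Random(1)
sample = [clt_normal(rng) for _ in range(20000)]
```

Values produced this way cannot fall outside the interval from -6 to +6, so the extreme tails of the normal distribution are not represented.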

d. Uniform Random Number Range Transformation 
The uniform random numbers were transformed 

to various ranges (page 79, Ret. t> J using the expression 

    Y_t = A + (B - A) \epsilon_t ,     0 < \epsilon_t < 1

which is a rearranged form of the uniform cumulative distri-
bution function. Since A = -B in this study, this expres-
sion can be written as

    Y_t = (2\epsilon_t - 1) B = (\epsilon_t - 0.5) 2B

which is the form used in the Fortran program to transform 

the random numbers to desired ranges. 
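The range transformation and its symmetric special case can be checked against one another with a few lines of code. The function names below are hypothetical and the sketch is written in Python rather than the Fortran of the thesis program.

```python
def to_range(u, a, b):
    """Map a uniform (0,1) number u onto the interval (a, b)."""
    return a + (b - a) * u

def to_symmetric_range(u, b):
    """Special case a = -b: map u onto the interval (-b, b)."""
    return (u - 0.5) * 2.0 * b

# Usage: both forms give the same value when a = -b.
general = to_range(0.25, -4.0, 4.0)
symmetric = to_symmetric_range(0.25, 4.0)
```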

5. Single and Group Series Forecast Performance Com-
parison

a. Single Series Comparison 

After a time series was produced from one of the 
generating models in the computer program, all forecast models 






were used to forecast each observed series value, given the 
past values. After each forecast the forecast error was 
stored and after all models had completed forecasting the 
series, the average forecast error and the sample variance 
of this error were calculated. From this the sample es- 
timate of the specification error (f) discussed in Section 
2 above was computed for each model. Table I contains data 
for a representative comparison made using this method of
measurement.

b. Forecast Comparisons Using Several Series

When the single series comparisons had been
completed, the average specification error and estimated
sample variance were then calculated for each forecast model.
This is the performance criterion discussed in Section 3.

As noted there, the results may be interpreted only in re- 
lation to the specific series combination examined. Table 
III contains representative data for comparison of series 
using this criterion. 

6. Forecast Model Stabilization and Operation

A stabilization period of 100 observations was used 
to remove any effects due to improper starting conditions 
and then the forecast error was compiled over the next 300 
observations. The specific initial conditions for each model 
which were used to obtain the representative results in 
Tables I and II may be determined by consulting the computer 






program. Sufficient comments have been provided to assist 
in identification of program components and the variables 
used in each. Although the results represent only sample 
estimates of average forecast error and forecast error 
variance, the number of observations forecast was considered 
large enough to provide a reasonable estimate of the actual 
values.
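The stabilization scheme, in which errors from the first 100 observations are discarded and errors from the next 300 are compiled, can be sketched as below. The `SimpleEWMA` class is a hypothetical stand-in for any of the forecast models, and all names are assumptions of this sketch rather than components of the thesis program.

```python
from statistics import mean, variance

class SimpleEWMA:
    """Hypothetical stand-in forecaster: the level is corrected by a
    fraction alpha of each one-step-ahead error."""
    def __init__(self, alpha, level=0.0):
        self.alpha = alpha
        self.level = level

    def predict(self):
        return self.level

    def update(self, y):
        self.level += self.alpha * (y - self.level)

def evaluate_after_warmup(series, forecaster, warmup=100, scored=300):
    """Run the forecaster over the series, but compile forecast errors
    only after the warm-up period has passed."""
    errors = []
    for t, y in enumerate(series[:warmup + scored]):
        error = y - forecaster.predict()
        if t >= warmup:
            errors.append(error)
        forecaster.update(y)
    return mean(errors), variance(errors), len(errors)

# Usage: a deliberately poor starting level of zero on a constant series.
avg_error, error_variance, n_scored = evaluate_after_warmup(
    [5.0] * 400, SimpleEWMA(0.5))
```

In this usage the starting level is far from the series level, yet the compiled error statistics are unaffected, because the starting transient has died out within the warm-up period.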

B. COMPUTER COMPARISON ANALYSIS 

1 . Prior Performance Expectations 

The forecast models compared were selected with 
certain a priori relative outcomes in mind. Due to the 
nature of the assumptions on the models and the generated 
series associated with those models it was anticipated that

a. each model would be "optimal" (have the least
estimated mean squared error) for its corresponding time
series.

b. the modified least-squares forecast model would
produce smaller forecast errors than the simple least-
squares model due to the correlated nature of most series
used in the study.

c. the forecasts generated by the simple EWMA 
model would tend to lag behind the linear growth processes 
by a relatively constant amount. 

d. the Cox-modified EWMA would tend to track the 
linear models better than the unmodified model. 

e. the Holt-Winters model would probably perform 
better overall than any of the models above since it is 






suitable for both the autocorrelated data and for the 
linear model used in five of the generating processes. 

f. the Theil-Wage model would have the same 
general characteristics as the Holt-Winters model, but 
probably provide better response to process changes due 

to their use of a more "simultaneous" approach (use of the 
most recent slope rather than the use of the slope calcu- 
lated last period) than that used by Holt and Winters. 

g. the Brown model would have generally the same 
characteristics as the Holt-Winters model, but not perform 
quite as well due to the restrictions which Brown placed 

on forecast parameters. It is argued by some [Ref. 7] that
Brown's more parsimonious model can achieve essentially the
same forecast accuracy as either the Holt-
Winters model or the Theil-Wage model. Quantitative evi-
dence to support or refute this claim was one expected result
of this investigation.

2. Forecast Model Performance Comparisons

a. The Least-Squares Series

The least-squares forecast model performed bet-
ter than all others in predicting this series, as anticipated.
Comparison of the process variance of 9.13 (see Table II)
with the forecast error variance of 8.98 tends to confirm
that the least-squares forecast model produced optimal pre-
dictions. The next best forecast model for this series was
Brown's double smoothing model with an 11.4% larger fore-
cast error variance.






It may be observed (Table I) that the modified 
least-squares model was ranked fourth after the Holt-Winters
model. The unnecessary correction for correlation in the 
random shocks degraded the quality of forecasts generated 
by this model. Note also that the simple EWMA forecast 
model lags the observations by 3.036 (see Table I). This 
was expected since this model specifies that the forecast 
for the next period (or any future period) is the current 
level of the process, and slope of the linear model was 
selected to be 3.0. The EWMA model was thus never able to 
anticipate the change due to slope. This is the refinement 
incorporated into Brown's double smoothing method and it 
appears to be effective in that model. 
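The lag of the simple EWMA model behind a linear trend can be reproduced on a noise-free series. For a trend of slope b and smoothing constant alpha, the one-step error e obeys e(t+1) = (1 - alpha) e(t) + b, which settles at b / alpha. The sketch below uses arbitrary illustrative constants; the smoothing constant of the thesis program is not assumed.

```python
def ewma_trend_lag(slope, alpha, n_obs=500):
    """Final one-step-ahead error of a simple EWMA forecast on the
    noise-free trend y_t = slope * t, starting from a level of zero."""
    level = 0.0
    error = 0.0
    for t in range(n_obs):
        y = slope * t
        error = y - level        # the forecast of y is the current level
        level += alpha * error   # error-correction update of the level
    return error

# Usage: slope 3.0 with an illustrative smoothing constant of 0.5.
lag = ewma_trend_lag(3.0, 0.5)
```

The lag never falls below the slope itself, which is the limiting case alpha = 1, so an average error close to the slope is the best that this model can achieve on a linear series.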

b. The Modified Least-Squares Series

The results for this series were much the same
as for the simple least-squares series except that the least-
squares models reversed their roles. The modified least-
squares model produced approximately the same forecast error
variance (8.98) from the correlated series as the least-
squares model had obtained previously on the uncorrelated
series (see Table II), and the least-squares model produced
almost the same specification error for this series as the
modified model had obtained when forecasting the uncorrelated
series. The change from uncorrelated to correlated normal
random shocks affected the other models in varying degrees
(see Table I). The least-squares model and the Brown model
appear to possess equivalent capabilities to forecast the
correlated series, while the modified EWMA model continued
to be the least desirable for use in forecasting the series.

c. The Simple EWMA Series 

The simple EWMA model, although its forecast 
error variance was not as low as the series variance (see 
Table II), gave the least forecast error variance for this 
series. Almost all models seemed to be capable of forecast- 
ing this series adequately, but the Theil-Wage model was the
least desirable by far. The non-linearity of this series 
became noticeable in the relatively poor performance of the 
least-squares model, but the effect was not pronounced due 
to the limited range of the random walk in the series gener- 
ating model. The modified least-squares model was more
capable of forecasting this series due to its ability to
use the added information contained in the series correla-
tion.

d. The Modified EWMA Series 

The modified EWMA forecast model developed by 
Cox generally provided the least forecast error variance, 
but it may be observed (Table I) that the modified least- 
squares model produced almost the same results for this 
series. The modified EWMA model does not share this versa- 
tility when forecasting the modified least-squares series. 
The next ranked model in terms of least forecast error was 
the simple EWMA model, but its error variance was 18.4% 
greater than the modified EWMA model. The modified least- 
squares model provided a 0.006 specification error during 



this particular comparison, but on others it reflected a 
slight negative error implying that it performed better than 
the "optimal" model. Based on the simulation results there 
appears to be no real difference in forecast accuracy re-
gardless of which of these two models was used. Reference
to Table II reflects that these two models were able to re- 
duce forecast error variance substantially below that ob- 
served in the series. This suggests a true predictive cap- 
ability not possessed by the other models whose forecast 
error variances were each approximately equal to or greater 
than the series variance. 

e. The Holt-Winters Linear Growth Series 

The Holt-Winters forecast model achieved the
least forecast error variance as expected, but its perfor-
mance was only slightly better than that shown by the Brown
model (see Table I) . A few comparison trials have resulted 
in the Holt-Winters model obtaining a smaller forecast error 
variance for the Brown growth series, and the Brown forecast 
model demonstrates a similar capability to perform better 
than the Holt-Winters model on the Holt-Winters series.

These outcomes were regarded as sample variations, but the 
implication is obvious. The results of these models are so 
comparable that it would be difficult to conclude that any 
real difference existed between them (for this series) . The 
next best performance (the Theil-Wage model) resulted in a 
72.6% increase in forecast error variance.






f. The Theil-Wage Series

The Theil-Wage forecast model proved to be 
distinctly optimal for forecasting its assumed underlying 
process. Previous comparisons have shown that the various 
observed series could have been generated by several slightly 
different models, as evidenced by the comparable forecast 
performance of the several forecast models. Here, however, 
the next best forecasts were obtained by the simple EWMA 
model with almost four times the forecast error variance. 

The minimum error variance of 18.14 (see Table II) exhibited 
by the Theil-Wage forecast model is much larger than the 
series variance of 7.99, but it compares favorably with the 
predicted theoretical mean square error of 19.7 obtained by 

~ -C r* ^ fr\ T 4 A 

UJ V WX V-XUil ^ is sJ I J • 

g. The Brown Linear Growth Series 

The results obtained while forecasting this 
series tend to further reinforce the observations made con- 
cerning the Holt-Winters results. The Brown double smooth- 
ing model obtained the least forecast error variance, but 
the Holt-Winters model only slightly exceeded that minimum 
value. Other models tended to forecast this series some- 
what more accurately than the Holt-Winters Series, but the 
same performance ranking resulted (see Table I) . 






TABLE I. MODEL SPECIFICATION ERRORS 1 AND AVERAGE FORECAST ERRORS

[Table entries not recoverable.]

1 Model Specification Errors are the Upper Values; Average Forecast Errors are the Lower Values



TABLE II. SUPPLEMENTARY FORECAST MODEL PERFORMANCE DATA

[Table entries not recoverable.]

1 v 1 ' Dpt 

which is a form of the Equation [(A-1) Section III] used to determine the specification



TABLE III. SAMPLE AVERAGES AND VARIANCES OF SPECIFICATION ERROR

[Table entries not recoverable.]



IV. CONCLUSION AND RECOMMENDATIONS



A. CONCLUSIONS 

1. Validity of Assumptions

Throughout this thesis it has been stressed that 
the final conclusions would necessarily be general in na- 
ture. The assumptions which led to these conclusions must 
not be overlooked, for some of them may place significant 
restrictions on the applicability of the methods studied. 
If these assumptions are changed, introducing other dis- 
tributions or parameters, the methodology introduced here 
is equally valid in those circumstances. 

Some specific assumptions which are thought to



a. The composition of the selected group of fore- 
cast models. Some models not considered here may have proved 
superior to all of those selected. The conclusions there- 
fore are applicable only to the specific set of forecast 
models treated. 



b. The specific parameters selected for each 
forecast model. Winters and Brown have recommended certain 
values for use with their models for general purpose ap- 
plication. No such recommendation was found for the Theil- 
Wage model, for example. A specific series was generated, 
and the optimal parameters for that series were used for 
all series. Perhaps a more general set of parameters for 
the Theil-Wage model exists which would have made its per- 
formance superior to the other models. 






c. The use of the uniform distribution and the 
range selected for each may have had significant impact 

on forecast model performance. These distributions do not 
necessarily represent any physical or economic process. 

d. Finally, it must be remembered that these 
conclusions result from simulated time series data where 
the generating parameters were accurately known, and no 
sweeping claims can be made for the results of any simula-
tion.

While it is perhaps proper to be skeptical of the specific 
conclusions to be made here, it is again suggested that 
much better assumptions can be made in the context of a 
problem, and when this comparison is repeated, the conclu-
sions resulting should be more significant and of prac-
tical value to the forecaster.

2. Conclusions and Applications

Based upon the representative results in Tables I 
and II, and the analysis in Section III, B, the following 
conclusions are made concerning forecast model performance 
for the tested series: 

a. The use of the least-squares model and the 
modified least-squares model for forecasting the Holt-Winters,
Theil-Wage, and Brown series gave poor results, primarily be- 
cause these series tend to be non-linear for many of the 
chosen ranges of the distribution of the random variables. 

In attempting a "non-discounted" linear fit to data which 
appear quadratic over large periods, enormous errors result. 






The lesson here, if one may be found, is that a linear fore- 
cast model such as least squares must not be applied to data 
which have a tendency to be non-linear. When such a case is 



suspected, the "discounted" linear models are to be prefer- 
red, since they tend to fit the data only locally, with the 
more distant observations given relatively little weight. 

b. The performance of the Holt-Winters, Brown,
and modified least-squares linear models on stationary (no
trend) data is good, so unless no question exists as to the 
constant level of the random process it would appear desir- 
able to use a linear model. In the event that a trend oc- 
curs, the model will follow it well, and if not, it will 
still give good forecasts. If a series similar to the
[illegible] be encountered, the modified
least-squares model is to be preferred over the Holt-Winters or
Brown models. This model is distinctly superior to all 
others when only the first four series in Table I are con- 
sidered. One reason for its superiority over the standard 
least-squares model is that the latter can generally only 
be expected to have, at best, a forecast error variance 
equal to the variance of the process about its mean. This 
is due to the model's attempt to forecast the mean value 
of the series (the least-squares regression line). The
modified form further extends the simple least-squares 
model and adds a correction factor based on the known cor- 
relation and the last forecast error. With this added in- 
formation its forecast capability is much improved. That 






this is only true when the random shocks are correlated, how- 
ever, may be seen from Table I. 

c. The Modified EWMA Model was designed to permit 
Wiener's linear predictor to adapt to changes in process 
level. It did not appear to achieve this goal very effec- 
tively, since the simple EWMA model performed better on 
every series except the one for which the modified model was 
intended to be optimal. The data tend to suggest that the 
Cox modified model is a special one with limited adaptive 
capability when applied as a result of misspecification of
the underlying model. This, of course, is not in agreement 
with the prior expectations of this model as stated in Sec-
tion IV B. 1, but after some reflection is not very sur-
prising. The model was intended to follow a slow, gradual
change in the mean level of a random walk. The rapid changes
which occur in the linear model would not normally be expec- 
ted to occur in a random walk process, or the walk would lose 
its random property. Therefore the model was not intended 
to follow a linear trend, and without further modification
should not be used to attempt this. 

d. The results in Table I lead to a conclusion 
that the Brown and Holt-Winters models are comparable, with 
Brown's model showing slightly better overall performance. 
This tends to support Harrison's claim [Ref. 7] that Brown's 
model is preferred in practice due to its simpler construc- 
tion but comparable result. 

e. The measure suggested earlier in this study 
of an average specification error, and specification error 






variance in selecting the best models tends to bias selec- 
tion in favor of the Theil-Wage forecast model (see Table 
III). The reason for this is that, although it did not 
perform as well on all other series as for instance the 
Brown model, the other models almost completely failed to 
follow the Theil-Wage series. As a result, the variance 
and average of specification error were inflated for all 
other models. The uniqueness of the Theil-Wage series, as 
evidenced by the relatively poor showing of other forecast 
models in predicting this series, is an excellent example 
of the need to make such model comparisons as this study 
has done. If one were restricted to using the other fore-
cast models for predicting this series, it might be thought
that the series could not be
accurately forecast by any model. When such a series oc-
curs in practice and none of the standard models seem to
apply, a comparison of widely assorted forecast models may 
suggest a more appropriate form for use. 
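
Conclusion b above attributes the advantage of the modified least-squares model to a correction based on the known correlation and the last forecast error. The following is a minimal Python sketch of that idea, not the thesis program itself. The function name, the sample size of 2000, the warm-up of 100 periods, and the use of Python's normal generator are illustrative assumptions; the slope of 3.0, the correlation of 0.4 and the shock standard deviation of 3.0 follow the values in the program listing.

```python
import random


def ls_and_modified_ls_mse(n=2000, b=3.0, rho=0.4, sigma=3.0,
                           warmup=100, seed=1):
    """Return (mse_ls, mse_mls) for one-step forecasts of a zero-intercept
    linear series whose shocks are first-order autocorrelated."""
    rng = random.Random(seed)
    sum_xy = 0.0
    sum_xx = 0.0
    eps = 0.0
    fc_ls = None      # simple least-squares forecast of the next value
    fc_mls = None     # modified least-squares forecast of the next value
    sq_ls = 0.0
    sq_mls = 0.0
    count = 0
    for t in range(1, n + 1):
        eps = rho * eps + rng.gauss(0.0, sigma)   # correlated shock
        y = b * t + eps
        if fc_ls is None:
            err_ls = 0.0                          # no forecast exists yet
        else:
            err_ls = fc_ls - y                    # error = forecast - observation
            err_mls = fc_mls - y
            if t > warmup:                        # skip the stabilization period
                sq_ls += err_ls ** 2
                sq_mls += err_mls ** 2
                count += 1
        # Zero-intercept least-squares slope estimate from all data so far.
        sum_xy += t * y
        sum_xx += t * t
        b_hat = sum_xy / sum_xx
        fc_ls = b_hat * (t + 1)
        # Correction: subtract rho times the latest simple-LS forecast error.
        fc_mls = fc_ls - rho * err_ls
    return sq_ls / count, sq_mls / count


mse_ls, mse_mls = ls_and_modified_ls_mse()
print("simple LS:", mse_ls, "modified LS:", mse_mls)
```

With rho set to zero the correction vanishes and the two forecasts coincide, which is consistent with the remark that the improvement holds only when the random shocks are correlated.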

An application of the comparison methodology pre- 
sented in this thesis might be made by a supply item manager, 
who is responsible for forecasting demands for stocked items 
and ensuring that stockouts do not occur more frequently than
some specified rate. The manager could select sample demand
data which were representative of his stockage items, or if
the differences in demand distributions for some items were 
too great, he might form two or more homogeneous groupings 
from which representative series were taken and select a 






model for each. The forecast models could then be used to 
forecast the selected series and the measures of specifica- 
tion error calculated using the least forecast error vari- 
ance model which resulted. Of course, the forecast technique 
in use at the time of the comparison should also be included. 
If the method is a good one, this comparison will demonstrate 
clearly how good it is in terms of the added forecast error 
involved in using other less applicable methods. The inter- 
pretation of 100 times the specification error as the added 
percentage of forecast error variance caused by model mis-
specification should be easily understood by all those in-
volved in model selection. This approach is recognized to
be strictly empirical, but for mass producing forecasts on
[illegible] not be justified in many cases.
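
The specification error measure used in this application can be sketched in a few lines of Python. This is an illustrative sketch: the function names and the small variance table are hypothetical, and the least-variance model for each series is taken as the reference, as the paragraph above suggests for actual data. The mean and variance over series are divided by the number of series, as in the program listing.

```python
def specification_errors(variances):
    """variances[i][j] is the forecast error variance of model j on series i.
    Returns the table of specification errors, variance / best - 1."""
    table = []
    for row in variances:
        best = min(row)                 # least forecast error variance model
        table.append([v / best - 1.0 for v in row])
    return table


def summarize(table):
    """Mean and variance, over all series, of each model's specification error."""
    n_series = len(table)
    n_models = len(table[0])
    summary = []
    for j in range(n_models):
        column = [table[i][j] for i in range(n_series)]
        mean = sum(column) / n_series
        var = sum((c - mean) ** 2 for c in column) / n_series
        summary.append((mean, var))
    return summary


# Hypothetical error variances for three models on two series.
variances = [[4.0, 5.0, 8.0],
             [9.0, 3.0, 6.0]]
table = specification_errors(variances)
for row in table:
    # 100 times the specification error is the added percentage of variance.
    print([round(100.0 * e, 1) for e in row])
print(summarize(table))
```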

In the event that no model appears clearly superior to 
others, the comparison results still permit an intelligent 
model selection procedure. Consider the results in Table I. 
It is evident that when the linearity, serial independence 
and normality assumptions are satisfied the least squares 
method should always be used. Uncertainty about the satis- 
faction of these assumptions poses an interesting decision 
problem. If the least squares forecast model is used and 
the generating model of the forecast series is actually as-
sociated with one of the other forecast models, the forecast
error variance "penalty" for use of the least squares method 
ranges from an additional 18.3% minimum to over 650,000%. 






On the other hand, if one were to select the Brown model, 
the "penalty" would range from 11.4% if the least squares 
model should have been chosen, up to a maximum of over 2300% 
in a worst case. For six of the seven possible series, 
however, the Brown model would result in less than a 25% 
penalty. The "max-min" solution suggested by Table I is 
the selection of the Theil-Wage model, which never results 
in a forecast error variance penalty greater than 113% or 
just over twice the error variance, no matter which series 
is forecast. It may be noted that this model, though, al- 
ways causes at least a 60% penalty unless it happens to be 
the optimal model. If the forecaster had reasonably good 
information on the likelihood of occurrence of the various 



series and [illegible]
variance, a decision rule could be formulated to guide
the selection of a preferred forecast model. 
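
The two selection rules discussed above, choosing the model whose worst penalty is smallest and choosing by expected penalty when likelihoods of the series are available, can be sketched as follows. The penalty table and the probabilities below are hypothetical illustrations, not the values of Table I, and the function names are assumptions of this sketch.

```python
def worst_case_choice(penalties):
    """Pick the model whose largest penalty over all series is smallest."""
    return min(penalties, key=lambda model: max(penalties[model]))


def expected_penalty_choice(penalties, probs):
    """Pick the model with the smallest probability-weighted penalty."""
    def expected(model):
        return sum(p * x for p, x in zip(probs, penalties[model]))
    return min(penalties, key=expected)


# Hypothetical penalties (specification errors) of three models on two series.
penalties = {
    "A": [0.0, 6.0],   # optimal on series 1, very poor on series 2
    "B": [0.2, 0.3],   # never optimal, never poor
    "C": [1.0, 0.0],   # optimal on series 2
}
print(worst_case_choice(penalties))
print(expected_penalty_choice(penalties, [0.99, 0.01]))
```

In this hypothetical table the worst-case rule selects the model that is never optimal but never poor, while a forecaster who is nearly certain that the first series will occur would select the model that is optimal for it.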

An additional application which is actually only a 
variation of the approach used in this thesis, is the modi- 
fication of various generating models to introduce varying 
degrees of autocorrelation (or lagged variables in the least 
squares models) to determine their relative effects on the 
forecast models. Previous studies have shown a negative 
bias in variance of least squares parameter estimates caused 
by autocorrelation. A negative bias in the estimates them-
selves occurs when β > 0 in the lagged variable case. These
results may be shown analytically (p. 211-221, [Ref. 12]).

The studies have gone on to show, through simulation, that 






a strong positive bias occurred when autocorrelation and 
lagged variables were both present. Such investigations of 
model interactions could make use of the methodology de- 
scribed in this thesis as a quantitative measure of results, 
and further add to the present understanding of economic 
time series data. 
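
Introducing a chosen degree of autocorrelation into a generating model, and checking the degree actually obtained, might be sketched as below. The function names, sample size and seed are assumptions of this sketch; the shock recursion and the autocorrelation estimate (autocovariance at the given lag divided by the variance, both divided by the number of observations) follow the form used in the program listing.

```python
import random


def ar1_shocks(n, rho, sigma=1.0, seed=0):
    """Generate n first-order autocorrelated shocks, eps = rho * eps + a."""
    rng = random.Random(seed)
    eps = 0.0
    shocks = []
    for _ in range(n):
        eps = rho * eps + rng.gauss(0.0, sigma)
        shocks.append(eps)
    return shocks


def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    acvf = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / n
    return acvf / var


shocks = ar1_shocks(5000, rho=0.4, seed=1)
print(autocorrelation(shocks, 1))   # should be near the chosen rho
```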

B. RECOMMENDATIONS 

1. Extensions of this Investigation

Some recommended extensions of this thesis which 
may result in improved forecasting models or better fore-
cast model selection criteria are:

a. A determination of the effect on the results 
of this study of distributions other than uniform. 

b. Investigation, along the lines suggested by
Cox [Ref. 5], of the advantages of replacing the simple EWMA
in Equation (C-3) by some other form of moving average.

It would be interesting to determine whether re- 
quirements such as minimizing the effects of long term trends 
could be used to choose between these alternative forms of 
moving average. Limited tests at the close of this study, 
where Brown's double smoothing model was substituted into 
Cox’s modified model instead of the simple EWMA model nor- 
mally used, resulted in significant improvements in fore- 
casts of series from the linear generating models. (Excluding 
the Theil-Wage series, the average specification error was 
reduced from 1.58 to 0.58.)

c. Investigation of "tracking signals." When a
misspecification occurs, such as using the Holt-Winters model






to forecast the Theil-Wage series, a model modification 
which has been suggested by some [Ref. 6, 16, 17, 18] is
the use of a tracking signal to serve as the "exception" to 
management that the forecast model is not able to perform 
satisfactorily. The forecast error tolerance of this sig- 
nal may be set at any level needed. Such signals may be 
used to obtain the attention of management, or as brought 
out in the references given, to adjust the smoothing con- 
stants to accommodate the series. This procedure probably 
has the most potential for handling the wide assortment of 
random process forecasts which is often required. The cal- 
culation of "optimal parameters" is obviously not a solution 
due to the variety of series encountered in practice. Em-
pirical determination of "generally good" parameters, such
as Winters and Brown have done, fails when gross specifica-
tion error occurs, but a combination of a "good" parameter
and a tracking signal adjustment would appear to be an ef- 
ficient solution to the problem. As an extension to this 
thesis it would be interesting to determine the degree of 
improvement gained by the addition of tracking signals 
to the models. Although such signals have generally been 
discussed in the context of EWMA models, the concept very 
likely could be applied to least squares models and improve 
their performance in those cases where the model assumptions 
are violated. Successful application of such refinements 
would contribute substantially to a more "automatic" fore- 
casting system and increase the population of time series 
for which any particular model may be successfully applied. 
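
A tracking signal of the kind referred to in recommendation c can be sketched as the ratio of the smoothed forecast error to the smoothed absolute forecast error, which is the form commonly attributed to Trigg [Ref. 16]. The sketch below is an assumption-laden illustration: the function name, the smoothing constant of 0.1 and the example error sequences are hypothetical, and no tolerance level is implied.

```python
def tracking_signal(errors, gamma=0.1):
    """Return the tracking signal after each forecast error.
    Values near +1 or -1 indicate persistently biased forecasts;
    values near 0 indicate errors that change sign freely."""
    smoothed_err = 0.0
    smoothed_abs = 0.0
    signals = []
    for e in errors:
        smoothed_err = gamma * e + (1.0 - gamma) * smoothed_err
        smoothed_abs = gamma * abs(e) + (1.0 - gamma) * smoothed_abs
        if smoothed_abs > 0.0:
            signals.append(smoothed_err / smoothed_abs)
        else:
            signals.append(0.0)       # no error observed yet
    return signals


biased = tracking_signal([2.0] * 50)            # forecast always too high
unbiased = tracking_signal([1.0, -1.0] * 50)    # errors alternate in sign
print(biased[-1], unbiased[-1])
```

The signal could be compared with a chosen tolerance to raise an exception report, or its absolute value could be used to adjust the smoothing constant, as described in the references cited above.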






2. Application of Methodology



While it might be premature to suggest that any 
actions be taken on the sample results obtained in this 
thesis, it is maintained that the procedure used is sound, 
and could be applied immediately to a practical problem 
using actual data. Slight modifications must be made to 
the Fortran program such as substituting "read" statements 
for the series generators and a logical check added to 
determine the model with least forecast error variance for 
a specific series. The program would then give results 
which could be readily interpreted by forecast personnel. 

It is strongly recommended that NavSup or any other agencies 
using forecast models give consideration to use of this
methodology as an evaluation of their existing forecast
model. If the quantitative measure obtained justifies con-
tinued use of the same model, then the procedure may be
repeated using only versions of the same model with varied 
parameters as a sensitivity evaluation on the model. The 
small amount of computer time required to perform a com- 
parison (less than 30 seconds on an IBM 360-67) or sensi- 
tivity analysis is trivial compared to the potential increase 
in effectiveness of forecasts if it is discovered that some 
other model or combination of models is a more satisfactory
predictor of the random processes of interest.
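
The modification described above, in which an actual series replaces the generators and a check selects the model with least forecast error variance, might look like the following Python sketch. It is not a translation of the whole program: only the simple EWMA and Brown's one-parameter model are included, the function names and the smoothing constant of 0.1 are assumptions, and the noise-free trend stands in for data that would be read from a file.

```python
def ewma_errors(series, alpha):
    """One-step forecast errors (forecast - observation) of the simple EWMA."""
    forecast = 0.0
    errors = []
    for y in series:
        errors.append(forecast - y)
        forecast = alpha * y + (1.0 - alpha) * forecast
    return errors


def brown_errors(series, alpha):
    """One-step forecast errors of Brown's double smoothing model."""
    single = double = forecast = 0.0
    factor = alpha / (1.0 - alpha)
    errors = []
    for y in series:
        errors.append(forecast - y)
        single = alpha * y + (1.0 - alpha) * single        # first-order smoothing
        double = alpha * single + (1.0 - alpha) * double   # second-order smoothing
        slope = factor * (single - double)
        forecast = 2.0 * single - double + slope
    return errors


def error_variance(errors, warmup):
    """Mean squared forecast error after the stabilization period."""
    kept = errors[warmup:]
    return sum(e * e for e in kept) / len(kept)


def select_model(series, warmup=100):
    """Return the name of the least error variance model and all variances."""
    candidates = {
        "simple EWMA": ewma_errors(series, 0.1),
        "Brown": brown_errors(series, 0.1),
    }
    variances = {name: error_variance(errs, warmup)
                 for name, errs in candidates.items()}
    return min(variances, key=variances.get), variances


# Stand-in for actual data: a noise-free linear trend of 400 observations.
series = [2.0 * t for t in range(1, 401)]
best, variances = select_model(series)
print(best, variances)
```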






C     A COMPARISON OF SHORT TERM FORECAST MODELS
C
C     THIS PROGRAM GENERATES SEVEN FORMS OF TIME SERIES WHICH
C     ARE EACH FORECAST BY SEVEN FORECAST MODELS, ONE OF WHICH
C     IS THE OPTIMAL (MINIMUM MEAN SQUARED ERROR) PREDICTOR OF
C     THE SERIES.  THE SPECIFICATION ERROR IS COMPUTED FOR
C     EACH FORECAST MODEL, FOR EACH TIME SERIES FORECAST.  THE
C     AVERAGE SPECIFICATION ERROR MEAN AND VARIANCE ARE THEN
C     CALCULATED AS AN OVERALL MEASURE OF FORECAST MODEL PERF.
C
      DIMENSION OBSN(501),RNUM(500),DIFLSE(500),DIFSEA(500),
     1DIFMEA(500),DIFHLT(500),DIFBRN(500),SUMSQ(7),
     2HLTLVL(500),ACORF(500),DIFF(500),DIFMLS(500),
     3DIFTHL(500),THLLVL(500),SPECF(7,7),COMBMU(7),
     4SUMER(7),VARSPC(7)
C
C     LOWLUP VALUE IS THE NUMBER OF INITIAL FORECAST PERIODS
C     WHICH ARE DESIGNATED FOR FORECAST STABILIZATION.  AFTER
C     THIS PERIOD, FCST ERROR CONTRIBUTES TO VAR. CALCULATION.
C     LOWLUP SHOULD NOT BE LESS THAN 3
      LOWLUP=100
C
C     MAXLUP IS THE TOTAL NUMBER OF OBSERVATIONS GENERATED (OR
C     READ IN IF THE PROGRAM IS MODIFIED TO USE ACTUAL DATA).
C     MAXLUP SHOULD NOT EXCEED 500 WITHOUT PGM MODIFICATION.
      MAXLUP=400
C
C     THIS INITIALIZES THE U(0,1) GENERATOR
      X=URN(-5)
C
C     'NUMBER' SPECIFIES THE TYPE SERIES TO BE GENERATED.  'DO
C     1000' STEPS THROUGH ALL SERIES.  1= LSE SERIES, 2= MOD LS
C     SERIES, 3= SIMPLE EWMA SERIES, 4= MOD EWMA SERIES (EXPO
C     AUTOCORRELATED), 5= HOLT-WINTERS LINEAR GROWTH SERIES,
C     6= THEIL-WAGE LINEAR GROWTH SERIES, 7= BROWN LINEAR
C     GROWTH SERIES.
      DO 1000 NUMBER=1,7
      OBSN(1)=0.0
      RNUM(1)=0.0
      GO TO (11,66,22,33,44,77,55),NUMBER
C     ****************************************************

C     THIS MODEL GENERATES A SERIES FOR WHICH A ZERO-INTERCEPT
C     LINEAR FORECAST MODEL IS OPTIMAL
C     'B' IS THE SLOPE VALUE TO BE SELECTED AS DESIRED.
   11 B=3.0
C     RNGSIG IS THE STD DEV OF NORMAL RV
      RNGSIG=3.0
C     THIS DO LOOP GENERATES THE DESIRED NUMBER OF RNS
      DO 100 I=1,MAXLUP
      SUM=0.0
      DO 20 J=1,12
C     THIS IS THE UNIF(0,1) RN GENERATOR
      X=URN(2)
      R=X
   20 SUM=SUM+R
C     GENERATOR WHICH PRODUCES A NORMAL (0,X) RANDOM NUMBER.
      RNUM(I)=(SUM-6.0)*RNGSIG
C     THIS COMPUTES THE DETERMINISTIC PART OF THE LINEAR MODEL
C     AND ADDS THE NORMAL RV TO IT.
      OBSN(I)=B*I+RNUM(I)
  100 CONTINUE
      GO TO 60
C     ****************************************************



C     THIS MODEL TAKES SAME NORMAL RN'S USED ABOVE AND GENERATES
C     A FIRST ORDER AUTOREGRESSIVE SERIES FOR WHICH THE MOD
C     LEAST SQUARES FORECAST IS OPTIMAL.
   66 RHO=0.4
      EPS=0.0
      DO 101 I=1,MAXLUP
C     THIS GENERATES CORRELATED RANDOM SHOCKS. SEE EQ.(A-4)
      EPS=RHO*EPS+RNUM(I)
C     THIS GENERATES THE DETERMINISTIC PART OF THE LINEAR MODEL
C     AS BEFORE, BUT NOW THE CORRELATED SHOCK IS ADDED.
      OBSN(I)=B*I+EPS
      RNUM(I)=EPS
  101 CONTINUE
      GO TO 60
C     ****************************************************



C     THIS MODEL GENERATES A SERIES FOR WHICH THE SIMPLE EWMA
C     FORECAST MODEL IS OPTIMAL.
C     'RNGGAM' IS THE RANGE OF RANDOMNESS ASSOCIATED
C     WITH THE SHOCK TO PROCESS LEVEL. SHOULD BE SET LESS THAN
C     THE DEGREE, RNG, ASSOCIATED WITH THE RANDOM SHOCK ADDED.
   22 RNGGAM=5.0
C     DEGREE OF RANDOMNESS, RNG, OF UNIF RV SHOULD ALSO BE SET
      RNG=12.0
      MU=0.0
C     THIS IS A UNIF(-X.0,X.0) RNDM GENERATOR. LOOP PROVIDES
C     THE NUMBER OF RNS DESIRED.
      DO 300 I=1,MAXLUP
      X=URN(2)
      RNUM(I)=(X-0.5)*RNG
C     AN IMA PROCESS IS GENERATED USING THE RANDOM SHOCK FORM
C     OF THE MODEL SHOWN IN EQ.(B-4).
      OBSN(I)=MU+RNUM(I)
      X=URN(2)
      BIT=(X-0.5)*RNGGAM
      MU=MU+BIT
      RNUM(I)=RNUM(I)+BIT
  300 CONTINUE
      GO TO 60
C     ****************************************************



C     THIS MODEL GENERATES A SERIES FOR WHICH COX'S MODIFIED
C     FORECAST MODEL IS OPTIMAL. COR COEFF=RHO**LAG. HERE LAG=1
C     'RHO1' IS THE DESIRED CORRELATION COEFFICIENT
   33 RHO1=0.4
      CRHO=1.0-RHO1
      UNIRNG=10.0
C     THIS IS A UNIF(-X,X) RNDM GENERATOR. LOOP GENERATES
C     THE PROPER NUMBER OF RNS DESIRED.
      DO 200 I=1,MAXLUP
      X=URN(2)
      RNUM(I)=(X-0.5)*UNIRNG
      OBSN(I+1)=RHO1*OBSN(I)+CRHO*RNUM(I)
  200 CONTINUE
      GO TO 60
C     ****************************************************



C     THIS MODEL GENERATES A SERIES FOR WHICH THE HOLT TWO-PARA
C     METER MODEL IS OPTIMAL. SLOPE EXPERIENCES A RNDM CHANGE.
C     THERE IS ALSO A RNDM MOVEMENT ABOUT THE LEVEL OF THE
C     PROCESS CONTRIBUTING TO THE FORECAST ERROR
C     THE DEGREE OF RANDOMNESS MUST BE SET FOR THE
C     RANGE OF THE FORECAST ERROR, RNGERR
   44 RNGERR=10.0
      SLOPE=1.0
      VECMU=0.0
      PCSLVL=0.0
      ALFA=0.2
      ALFA1=0.1
      DO 400 I=2,MAXLUP
      X=URN(2)
C     THIS GENERATES THE U(-X,X) FCST ERROR
      RNUM(I)=(X-0.5)*RNGERR
C     THIS COMPUTES THE OBSERVED VALUE OF THE PROCESS WITH A
C     RANDOM FORECAST ERROR INCLUDED.
      OBSN(I)=PCSLVL+RNUM(I)
      VECMU=VECMU+SLOPE+ALFA*RNUM(I)
C     THIS COMPUTES NEW VALUE OF SLOPE DUE TO RNDM PROCESS CHNG
      SLOPE=SLOPE+ALFA*ALFA1*RNUM(I)
C     THIS COMPUTES NEW LEVEL FOR SERIES BASED ON LATEST EST OF
C     SLOPE PLUS THE RNDM CHANGE FOR THE PERIOD
      PCSLVL=VECMU+SLOPE
  400 CONTINUE
      GO TO 60






C     ****************************************************
C     THIS MODEL GENERATES A SERIES FOR WHICH THE THEIL-WAGE
C     FORECAST MODEL IS OPTIMAL.
C     THE RANGE OF UNCERTAINTY OF FCST ERROR AND SLOPE CHNGE
C     MUST BE SET TO PROPERLY SPECIFY THE UNDERLYING PROCESS.
   77 YLVL=0.0
      SLP=0.0
      EPSRNG=10.0
      DELRNG=4.0
C     THE GENERATING MODEL CORRESPONDS TO EQ.(D-1)
      DO 450 I=1,MAXLUP
      X=URN(2)
      BIT1=(X-0.5)*DELRNG
      SLP=SLP+BIT1
      YLVL=YLVL+SLP
      X=URN(2)
      BIT2=(X-0.5)*EPSRNG
      OBSN(I)=YLVL+BIT2
      RNUM(I)=BIT1+BIT2
  450 CONTINUE
      GO TO 60
C     ****************************************************



C     THIS MODEL GENERATES A SERIES FOR WHICH BROWN'S ONE
C     PARAMETER MODEL IS OPTIMAL
C     THE FOLLOWING INITIAL CONDITIONS AND PARAMS MUST BE SET.
   55 RNGERR=10.0
      ALFA=0.1
      BSLOPE=1.0
      BRNLVL=0.0
      FCSBRN=0.0
      BETA=1.0-ALFA
      BETASQ=BETA*BETA
      ALFASQ=ALFA*ALFA
      DO 500 I=1,MAXLUP
      X=URN(2)
      RNUM(I)=(X-0.5)*RNGERR
      OBSN(I)=FCSBRN+RNUM(I)
C     THIS CONTRIBUTES A RANDOM CHANGE TO PROCESS LEVEL
      BRNLVL=BRNLVL+BSLOPE+(1.0-BETASQ)*RNUM(I)
C     THIS CONTRIBUTES A RANDOM CHANGE TO SLOPE
      BSLOPE=BSLOPE+ALFASQ*RNUM(I)
C     THIS LEVEL CONTRIBUTES TO NEXT PERIOD OBSERVATION
      FCSBRN=BRNLVL+BSLOPE
  500 CONTINUE
C     ****************************************************



C     THIS SECTION COMPUTES MEAN, VARIANCE & AUTOCORRELATION
C     OF THE OBSERVATIONS. THE CORRELATION BETWEEN SUCCESSIVE
C     OBSERVATIONS, RHO1, IS USED IN SUBSEQUENT FORECAST MODELS
   60 NOBSN=MAXLUP
      MAXLAG=5
      K=2
    3 GO TO (5,4,99),K
    5 K=3
      WRITE(6,2499)NUMBER
 2499 FORMAT(1H ,/////,10X,'STATISTICAL CHARACTERISTICS OF',
     1' THE GENERATED SERIES',5X,'NUMBER =',I3,//)
      GO TO 9
    4 K=1
      WRITE(6,2498)NUMBER
 2498 FORMAT(1H ,/////,10X,'STATISTICAL CHARACTERISTICS OF',
     1' THE RNDM SERIES',5X,'NUMBER =',I3,//)
      XSUM=0.0
      DO 12 I=1,MAXLUP
   12 XSUM=XSUM+RNUM(I)
      XBAR=XSUM/(NOBSN*1.0)
      VAR=0.0
      DO 21 I=1,NOBSN
      DIFF(I)=RNUM(I)-XBAR
   21 VAR=DIFF(I)**2+VAR
      GO TO 24
    9 XSUM=0.0
      DO 10 I=1,MAXLUP
   10 XSUM=OBSN(I)+XSUM
      XBAR=XSUM/(NOBSN*1.0)
      VAR=0.0
      DO 23 I=1,NOBSN
      DIFF(I)=OBSN(I)-XBAR
   23 VAR=DIFF(I)*DIFF(I)+VAR
   24 VAR=VAR/(NOBSN*1.0)
C     VAR IS VARIANCE OR AUTOCOVARIANCE FCTN WITH ZERO LAG.
   25 DO 40 J=1,MAXLAG
C     LIMLUP IS THE UPPER LOOP LIMIT WHICH DECREASES WITH LAG.
      LIMLUP=NOBSN-J
      ACVF=0.0
      DO 30 I=1,LIMLUP
      XPROD=DIFF(I)*DIFF(I+J)
   30 ACVF=XPROD+ACVF
      ACVF=ACVF/(NOBSN*1.0)
C     ACVF IS THE AUTOCOVARIANCE FUNCTION WITH LAG J.
      ACORF(J)=ACVF/VAR
C     ACORF IS THE AUTOCORRELATION FCTN, THE NORMALIZED ACVF.
      WRITE(6,2501)J,ACORF(J)
 2501 FORMAT(1H ,/10X,'AUTOCORRELATION COEFFICIENT WITH LAG'
     1,I3,'= ',F7.5)
   40 CONTINUE
      RHO1=ACORF(1)
      WRITE(6,2600)XBAR,VAR,RHO1
 2600 FORMAT(1H ,/5X,'XBAR=',F10.5,5X,'VAR=',F10.5,5X,'RHO1='
     1,F7.5,//)
      GO TO 3
C     ****************************************************



C     THIS IS THE LSE FORECAST MODEL (ZERO INTERCEPT FORM)
C     THESE INITIAL CONDITIONS SPECIFICALLY ANTICIPATE THE
C     LSE SERIES GENERATED EARLIER.
   99 AHAT=0.0
      SUMXY=B
      SUMXSQ=1.0
      DIFLSE(1)=0.0
      FCSLSE=B*2.0
      FCSMLS=0.0
      RHO=0.4
      DO 600 I=2,MAXLUP
C     FORECAST ERROR IS CALCULATED AND ACCUMULATED FOR LATER.
      DIFLSE(I)=FCSLSE-OBSN(I)
      DIFMLS(I)=FCSMLS-OBSN(I)
C     BEFORE COEFFICIENTS CAN BE ESTIMATED, XBAR AND YBAR MUST
C     BE CALCULATED.
      SUMXY=SUMXY+I*OBSN(I)
      SUMXSQ=SUMXSQ+I*I
      BHAT=SUMXY/SUMXSQ
C     NOW THE FCST CAN BE MADE, USING THE COEFFICIENTS, AHAT & BHAT
      FCSLSE=AHAT+BHAT*(I+1)
C     THIS IS THE MODIFIED LEAST SQUARES FORECAST MODEL.
C     SINCE IT USES THE SAME PARAMETER ESTIMATES AS THE LSE
C     MODEL PLUS A CORRECTION, ITS FORECASTS ARE GENERATED AT
C     THE SAME TIME AS THE LSE MODEL. THIS IS THE SAME AS EQ.
C     (A-6), EXCEPT SIGN CHANGE DUE TO DIFFERENT FCST ERROR FORM
      FCSMLS=FCSLSE-RHO*DIFLSE(I)
  600 CONTINUE
C     ****************************************************



C     THIS IS THE SIMPLE EWMA FORECAST MODEL WITH OPTIMAL ALFA
C     FCS1 IS THE INITIAL FORECAST NEEDED TO START THE SIMPLE
C     EXPONENTIALLY WEIGHTED MOVING AVERAGE FORECAST SERIES
      FCS1=0.0
      ALFA=(3.0*RHO1-1.0)/(2.0*RHO1)
      IF(RHO1.LE.0.3333)ALFA=0.075
      BETA=1.0-ALFA
      DO 700 I=1,MAXLUP
      DIFSEA(I)=FCS1-OBSN(I)
C     THIS GENERATES FORECAST FOR NEXT PERIOD USING SIMPLE EWMA
      FCS1=ALFA*OBSN(I)+BETA*FCS1
  700 CONTINUE
C     ****************************************************



C     THIS IS THE MODIFIED EWMA FORECAST MODEL PROPOSED BY COX
      ALFA=0.01
      BETA=1.0-ALFA
      FCSEMA=0.0
      DO 800 I=2,MAXLUP
C     THIS GENERATES FORECAST FOR NEXT PERIOD USING SIMPLE EWMA
C     AND OPTIMAL ALFA ASSUMING EXPONENTIAL AUTOCORRELATION.
      FCSEMA=ALFA*OBSN(I-1)+BETA*FCSEMA
C     THIS USES SIMPLE EWMA IN COX'S MODIFIED EWMA.
      FCSMEA=RHO1*OBSN(I-1)+(1.0-RHO1)*FCSEMA
      DIFMEA(I)=FCSMEA-OBSN(I)
  800 CONTINUE
C     ****************************************************



C     THIS IS THE HOLT TWO-PARAMETER FORECAST MODEL
      ALFA=0.2
      BETA=1.0-ALFA
      BSLOPE=0.0
      HLTLVL(1)=0.0
      FCSHLT=0.0
      ALFA1=0.1
      BETA1=1.0-ALFA1
      DO 900 I=2,MAXLUP
      DIFHLT(I)=FCSHLT-OBSN(I)
C     THIS ESTIMATES THE CURRENT PROCESS LEVEL
      HLTLVL(I)=ALFA*OBSN(I)+BETA*(HLTLVL(I-1)+BSLOPE)
C     SLOPE IS UPDATED FOR USE ON NEXT ITERATION.
      BSLOPE=ALFA1*(HLTLVL(I)-HLTLVL(I-1))+BETA1*BSLOPE
C     FORECAST IS GENERATED USING CURRENT LEVEL AND SLOPE EST
      FCSHLT=HLTLVL(I)+BSLOPE
  900 CONTINUE
C     ****************************************************



C     THIS IS THE THEIL-WAGE FORECAST MODEL.
      FCSTHL=0.0
      ALFA=0.596
      ALFAP=0.425
      BETA=1.0-ALFA
      BETAP=1.0-ALFAP
      FACALF=ALFA/(1.0-BETA*ALFAP)
      FACBET=BETA*BETAP/(1.0-BETA*ALFAP)
      THLLVL(1)=0.0
      SLP=0.0
      DO 925 I=2,MAXLUP
      DIFTHL(I)=FCSTHL-OBSN(I)
      THLLVL(I)=FACALF*OBSN(I)+FACBET*(THLLVL(I-1)+SLP)
      SLP=ALFAP*(THLLVL(I)-THLLVL(I-1))+BETAP*SLP
      FCSTHL=THLLVL(I)+SLP
  925 CONTINUE
C     ****************************************************



C     THIS IS BROWN'S ONE-PARAMETER FORECAST MODEL
      ALFA=0.1
      BETA=1.0-ALFA
      SNGLSM=0.0
      DBLSMO=0.0
      BSLOPE=0.0
      FCSBRN=0.0
      DIFBRN(1)=0.0
      FACTOR=ALFA/BETA
      DO 950 I=2,MAXLUP
      DIFBRN(I)=FCSBRN-OBSN(I)
C     THIS COMPUTES THE SINGLE (1ST ORDER) SMOOTHED ESTIMATE
      SNGLSM=ALFA*OBSN(I)+BETA*SNGLSM
C     THIS COMPUTES THE 2ND ORDER SMOOTHED ESTIMATE OF SERIES
      DBLSMO=ALFA*SNGLSM+BETA*DBLSMO
      SLOPE=FACTOR*(SNGLSM-DBLSMO)
      FCSBRN=2.0*SNGLSM-DBLSMO+SLOPE
  950 CONTINUE
C     ****************************************************



      DO 2999 J=1,7
      SUMER(J)=0.0
 2999 SUMSQ(J)=0.0
C     THIS COMPUTES SUMS OF FORECAST ERRORS AND SQUARED ERRORS.
      DO 3000 I=LOWLUP,MAXLUP
      SUMER(1)=SUMER(1)+DIFLSE(I)
      SUMSQ(1)=SUMSQ(1)+DIFLSE(I)**2
      SUMER(2)=SUMER(2)+DIFMLS(I)
      SUMSQ(2)=SUMSQ(2)+DIFMLS(I)**2
      SUMER(3)=SUMER(3)+DIFSEA(I)
      SUMSQ(3)=SUMSQ(3)+DIFSEA(I)**2
      SUMER(4)=SUMER(4)+DIFMEA(I)
      SUMSQ(4)=SUMSQ(4)+DIFMEA(I)**2
      SUMER(5)=SUMER(5)+DIFHLT(I)
      SUMSQ(5)=SUMSQ(5)+DIFHLT(I)**2
      SUMER(6)=SUMER(6)+DIFTHL(I)
      SUMSQ(6)=SUMSQ(6)+DIFTHL(I)**2
      SUMER(7)=SUMER(7)+DIFBRN(I)
 3000 SUMSQ(7)=SUMSQ(7)+DIFBRN(I)**2
C     THIS COMPUTES ESTIMATED FORECAST ERROR MEAN AND VARIANCE
C     FOR EACH FORECAST MODEL.
      VARFAC=MAXLUP-LOWLUP-1.0
      DO 4000 J=1,7
      SUMER(J)=SUMER(J)/(VARFAC+1.0)
 4000 SUMSQ(J)=SUMSQ(J)/VARFAC
C     THIS COMPUTES THE MEASUREMENT OF SPECIFICATION ERROR,
C     THE RATIO OF FORECAST MODEL'S FORECAST ERROR VARIANCE
C     TO THE VARIANCE OF THE OPTIMAL MODEL FOR THAT SERIES.
      DO 5000 J=1,7
 5000 SPECF(NUMBER,J)=(SUMSQ(J)/SUMSQ(NUMBER))-1.0
      WRITE(6,5001)(L,SPECF(NUMBER,L),L,SUMSQ(L),L,SUMER
     1(L),L=1,7)
 5001 FORMAT(1H ,/,5X,'SPEC ERROR (',I2,') =',2X,F10.5,
     15X,'VAR (',I2,') =',2X,F12.4,5X,'AVG ERR (',I2,') =',
     22X,F10.5,/)
C     THIS PRINTS OUT THE OBSERVATIONS AND THE FORECAST ERRORS
C     EXPERIENCED BY EACH FORECAST MODEL.
      WRITE(6,2099)NUMBER
 2099 FORMAT(1H ,////,7X,'SUMMARY OF FORECAST RESULTS FOR',
     1' THE GENERATED SERIES',5X,'NUMBER =',I3,///)
      WRITE(6,2100)
 2100 FORMAT(1H ,//,5X,'SERIES OBSN',4X,'DIFF LSE',4X,
     1'DIFF MLSE',5X,'DIFF SEWMA',5X,'DIFF MEWMA',5X,
     2'DIFF HOLT',5X,'DIFF THEIL',5X,'DIFF BROWN',/)
      WRITE(6,2200)(OBSN(L),DIFLSE(L),DIFMLS(L),DIFSEA(L),
     1DIFMEA(L),DIFHLT(L),DIFTHL(L),DIFBRN(L),L=LOWLUP,
     2MAXLUP)
 2200 FORMAT(1H ,8(5X,F10.4))
 1000 CONTINUE
      NUMBER=7
      FNUM=NUMBER*1.0

C     THIS COMPUTES THE AVERAGE SPEC ERROR FOR EACH OF THE
C     FORECAST MODELS, J, OVER ALL SERIES, I.
      DO 1100 J=1,NUMBER
      COMSUM=0.0
      DO 1111 I=1,NUMBER
 1111 COMSUM=COMSUM+SPECF(I,J)
      COMBMU(J)=COMSUM/FNUM
C     THIS COMPUTES THE SPECIFICATION ERROR SAMPLE VARIANCE
C     FOR EACH FORECAST MODEL, J, OVER ALL SERIES, I.
      ESQSUM=0.0
      DO 1112 I=1,NUMBER
 1112 ESQSUM=ESQSUM+(SPECF(I,J)-COMBMU(J))**2
      VARSPC(J)=ESQSUM/FNUM
 1100 CONTINUE
C     THIS SUMMARIZES THE SPECIFICATION ERROR DATA FOR COMPAR.
      WRITE(6,3100)
 3100 FORMAT(1H ,////,20X,'LSE MDL',8X,'MLSE MDL',7X,'EWMA MDL',
     17X,'MEWMA MDL',6X,'HLT-WNT MDL',4X,'THL-WGE MDL',4X,
     2'BRN MDL'/)
      DO 3111 I=1,NUMBER
 3111 WRITE(6,3110)I,(SPECF(I,J),J=1,NUMBER)
 3110 FORMAT(1H ,6X,'SERIES (',I2,')',2X,7(F9.5,6X)//)



» ; n t t i 



i V- 



.■)0 DU * x ; < A i v jl n i / / / / » lu A f 

     1'COMPARISON',/)

      WRITE(6,3120)(L,COMBMU(L),VARSPC(L),L=1,NUMBER)
 3120 FORMAT(1H ,10X,'FORECAST MODEL (',I2,')',5X,F9.5,10X,
     1F10.5)

STOP 

END 






LIST OF REFERENCES 



1. Zehna , P. W. , Probability Distributions and Statistics , 

p. 477-501, Allyn and Bacon, Inc., 1970. 

2. Coventry, J. A., A Comparison of Demand Forecasting 

Technique s , M.S. Thesis, US Naval Postgraduate School, 
March 1971. 

3. Box, G. E. P. and Jenkins, B. M. , Time Series Analysis , 

Forecasting and Control, p. 103-108, Holden-Day, Inc., 
1970. 

4. Muth, J. F. , "Optimal Properties of Exponentially 

Weighted Forecasts," Journal of the American Statistical 
Association , p. 299-306, June 1960. 

5. Cox, D. R., "Prediction By Exponentially Weighted Moving 

Averages and Related Methods," Journal of the Royal 
Statistical Society , B23, p. 414- 442 , 1966 . 

6. Brown, R. G., Smoothing, Forecasting and Prediction , 

Prentice Hall Inc., 1963. 

/• Hal I JLPUil J i. . * LiAHOilwil C1UJ. c^iil Ou UjiC l V. tC x'iTl 

Sales Forecasting," Management Science, v. 13, No. 11, 
July 1967. 

8. Naylor, T. H., and others, Computer Simulation Techni - 

ques , p. 118-121, Wiley, 1968. 

9. Holt, C. C., Forecasting Seasonals and Trends by
Exponentially Weighted Moving Averages, Carnegie Institute
of Technology, Pittsburgh, Pennsylvania, 1957.

10. Winters, P. R., "Forecasting Sales by Exponentially
Weighted Moving Averages," Management Science, v. 6,
No. 3, p. 324-342, April 1960.

11. Theil, H. and Wage, S., "Some Observations on Adaptive
Forecasting," Management Science, v. 10, No. 2,
January 1964.

12. Johnston, J., Econometric Methods, p. 177-199,
McGraw-Hill Inc., 1963.

13. Bossons, J., "The Effects of Parameter Misspecification
and Nonstationarity on the Applicability of Adaptive
Forecasts," Management Science, v. 12, No. 9, May
1966.






14. Box, G. E. P. and Jenkins, G. M., "Some Statistical
Aspects of Adaptive Optimization and Control," Journal
of the Royal Statistical Society, B, 24 (2), p. 297,
1962.

15. Gilchrist, W. G., "Methods of Estimation Involving
Discounting," Journal of the Royal Statistical
Society, B, 29, No. 2, p. 355-369, 1967.

16. Trigg, D. W., "Monitoring a Forecasting System,"
Operational Research Quarterly, v. 15, p. 271-274,
1964.

17. Trigg, D. W. and Leach, A. G., "Exponential Smoothing
with an Adaptive Response Rate," Operational Research
Quarterly, v. 18, No. 1, p. 53-59, 1967.

18. Rao, A. G. and Shapiro, A., "Adaptive Smoothing Using
Evolutionary Spectra," Management Science, v. 17,
No. 3, p. 208-218, November 1970.






INITIAL DISTRIBUTION LIST

                                                  No. Copies

1. Defense Documentation Center                        2
   Cameron Station
   Alexandria, Virginia 22314

2. Library, Code 0212                                  2
   Naval Postgraduate School
   Monterey, California 93940

3. Chief of Naval Personnel                            1
   Pers-11b
   Department of the Navy
   Washington, D. C. 20370

4. Naval Postgraduate School                           1
   Department of Operations Research
   and Administrative Sciences
   Monterey, California 93940

5. Professor P. W. Zehna, Code 55Ze                    1
   Department of Operations Research
   and Administrative Sciences
   Naval Postgraduate School
   Monterey, California 93940

6. MAJ Ralph E. Hayes, USA                             1
   3902 Commander Drive
   Columbus, Georgia 31903






UNCLASSIFIED
Security Classification

DOCUMENT CONTROL DATA - R & D

(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)

1. ORIGINATING ACTIVITY (Corporate author)

Naval Postgraduate School
Monterey, California

2a. REPORT SECURITY CLASSIFICATION

UNCLASSIFIED

2b. GROUP



3. REPORT TITLE

A COMPARISON OF SHORT-TERM FORECASTING MODELS

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)

Master's Thesis; September 1971

5. AUTHOR(S) (First name, middle initial, last name)

Ralph E. Hayes

6. REPORT DATE

September 1971

7a. TOTAL NO. OF PAGES

95

7b. NO. OF REFS

18


8a. CONTRACT OR GRANT NO.

b. PROJECT NO.

c.

d.

9a. ORIGINATOR'S REPORT NUMBER(S)

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned this report)


10. DISTRIBUTION STATEMENT

Approved for public release; distribution unlimited.


11. SUPPLEMENTARY NOTES

12. SPONSORING MILITARY ACTIVITY

Naval Postgraduate School
Monterey, California 93940



13. ABSTRACT

Seven short-term forecasting models, two using least-squares
estimation methods and five employing variations of the
exponentially weighted moving average method, are compared
in their relative ability to produce minimum error variance
forecasts for seven simulated time series. Each series was
generated to enable one of the forecast models to be the
least squared error predictor. A comparison methodology is
developed which facilitates forecast model selection based
on single or group series forecast performance through the
measurement of model specification errors. A computer
program is presented which may be modified to accept real
time series and which permits the forecast models to be
ranked in order of their relative specification error.






14. KEY WORDS



Forecasting Models 

Model Specification Error 
Measurement 

Time Series Simulation 

Forecast Model Performance 
Comparison 

Adaptive Forecasting 
Prediction Methods 
Exponential Smoothing 
Estimation 









